Systematic trend following has, on average, been profitable for at least two centuries; yet since approximately 2009, short-term trends have ceased to deliver reliable returns. Using a cross-section of roughly 100 liquid futures contracts spanning 1995-2025, together with an industry-representative CTA proxy, we document the break and characterise its dependence on signal speed and asset class. We evaluate four candidate explanations - capacity constraints, market electronification, a regime change in CTA-versus-order-flow interactions, and a microstructural mechanism - and find that the first three fail on grounds of timing, magnitude, or cross-sectional heterogeneity.
Our central empirical finding is that the cross-sectional variable distinguishing degraded from surviving trends is the volatility-normalised tick size: post-2008 trend PnL has collapsed on small-tick contracts across all signal horizons, while remaining essentially intact on large-tick ones. Neither asset class nor liquidity replicates this dichotomy.
We interpret this result through a self-fulfilling feedback loop that, in our view, lies at the heart of the trend anomaly itself: trend signals trigger directional trades, whose market impact reinforces the very price moves that generated the signal. Both the profitability and the persistence of trend are sustained by this impact channel, which requires that trend followers can execute aggressively at reasonable cost. We argue that the post-crisis transition to HFT-dominated market making, whose liquidity-withdrawal behaviour in front of predictable directional flow has sharply contrasting consequences for sparse (small-tick) and dense (large-tick) limit order books, has broken this loop on small-tick contracts. On large-tick contracts, residual depth remains sufficient, and the loop continues to operate.
Timing-based tilts across asset classes can drive much of the risk and return of a diversified cross-asset portfolio. The standard approach forecasts returns and then optimizes weights. We instead study an end-to-end AI-based policy that maps market states directly to portfolio weights, and we then ask when this one-step modeling approach outperforms simple rules-based strategies. We train these policies on the sixteen most liquid CME futures, where an edge is unlikely to be due to illiquidity, using a differentiable Sharpe ratio loss function, and we benchmark them against equal weighting, risk parity, and time-series momentum. The learned policies rank above the rules on the pooled cross-asset portfolio and in several sub-asset classes, but not uniformly. In gross terms, an LSTM and a transformer-based architecture perform comparably out-of-sample, but diverge when we consider transaction costs. The transformer generates the stronger learned policy, trades far less than the LSTM, and matches or exceeds equal weighting through moderate cost.
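As a rough illustration of the training objective mentioned above, the sketch below computes a negative Sharpe ratio from per-period portfolio weights and asset returns. It is a minimal sketch under stated assumptions, not the authors' implementation: the softmax mapping to long-only, fully invested weights, the `eps` stabiliser, and all function names are hypothetical. In practice the same expression would be written in an autodiff framework so that gradients flow back to the policy network.

```python
import math

def softmax(scores):
    """Map raw policy outputs to long-only weights summing to 1 (an assumption)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def portfolio_returns(weights_by_period, asset_returns_by_period):
    """Per-period portfolio return: dot product of weights and asset returns."""
    return [
        sum(w * r for w, r in zip(ws, rs))
        for ws, rs in zip(weights_by_period, asset_returns_by_period)
    ]

def negative_sharpe(port_returns, eps=1e-8):
    """Loss = -(mean / std); smooth in the returns wherever std > 0."""
    n = len(port_returns)
    mean = sum(port_returns) / n
    var = sum((r - mean) ** 2 for r in port_returns) / n
    return -mean / (math.sqrt(var) + eps)
```

Minimising this loss is equivalent to maximising the in-sample Sharpe ratio of the policy's weights, which is why no separate return forecast is needed in the one-step approach.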
LLM agents are increasingly cast as autonomous portfolio managers, and benchmarks have moved from financial question-answering to sequential trading. Yet most still rank agents by returns over a fixed window -- a weak proxy, since a period's return is dominated by the market path and apparent alpha can dissolve once look-ahead leakage is controlled. Such a ranking certifies neither sound reasoning, nor a consistent strategy, nor a durable edge. We introduce CLQT, which reframes closed-loop trading evaluation as diagnosis rather than ranking: an instrument that localizes where and why an agent's process succeeds or fails. CLQT is a fully closed-loop, cost-aware, strategy-consistent, temporally-gated environment whose agents run a five-stage cycle: gather, synthesize, allocate, execute, reflect. Each round emits a complete DecisionRound sealed into a recompute-verifiable hash chain, so every metric is reconstructable from the trail. Six pillars form the substrate: a hard TimeGate, institutional transaction- and financing-cost modeling, strategy-consistency scoring, three-tier memory, a Model-Context-Protocol tool layer, and mandate-aware synthesis. The same agent runs as a constrained committee of specialized roles or a single full-autonomy orchestrator, making process scaffolding an experimental variable. From the audit trail we compute a five-axis capability scorecard (APM-CS: Coherence, Acuity, Composure, Discipline, Reliability), with Coherence judged partly by a held-out, out-of-cohort LLM to curb self-preference bias. We validate it on a contamination-controlled multi-model backtest with an ablation grid and a live broker track on unseen, post-cutoff data, against a repeated-run noise floor. CLQT separates outcome from capability, yielding not a model ranking but a durable, extensible map of agent competencies and limitations.
An exact identity shows invariance to scale and (p-1) other directions, giving a sharper constant under heavy tails.
The global minimum-variance portfolio (GMVP) is the canonical decision built from an estimated covariance matrix, yet covariance estimators are universally evaluated by matrix-norm loss, which is not the object the decision depends on. We characterise exactly how covariance-estimation error maps into GMVP suboptimality. We prove an exact regret identity and a non-asymptotic bound showing decision regret depends on the estimation error only through its action on the portfolio weights, scaled by portfolio concentration and the conditioning of the true covariance. From this we derive the decision geometry: GMVP regret is invariant to a (p-1)-dimensional projection of the p^2-dimensional error matrix, with invariance to the covariance-scale direction as an exact special case. We then apply the framework to heavy-tailed returns (tail index kappa in (2,4)), establishing the regret convergence rate implied by the centred operator-norm rate, and confirm the theory on a skew-t/t-copula simulation design with pre-registered analysis. The decision-focused advantage is a sharper constant and a concentration discount rather than a faster rate; we report an honest high-conditioning boundary of the rate prediction. The results complement recent decision-focused learning approaches by supplying the exact estimation geometry and consistency theory they lack.
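The scale-invariance special case can be checked numerically. The sketch below assumes the standard GMVP weights, w = Σ⁻¹1 / (1ᵀΣ⁻¹1), and takes regret to mean the excess true variance of the estimated-weight portfolio over the true GMVP; that regret definition, the example matrices, and all function names are illustrative assumptions, not taken from the paper.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def gmvp_weights(cov):
    """w = inv(cov) 1 / (1' inv(cov) 1)."""
    z = solve(cov, [1.0] * len(cov))
    total = sum(z)
    return [v / total for v in z]

def variance(w, cov):
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))

def regret(cov_hat, cov_true):
    """Excess true variance from using weights built on cov_hat (assumed definition)."""
    return variance(gmvp_weights(cov_hat), cov_true) - variance(
        gmvp_weights(cov_true), cov_true
    )

def scaled(cov, c):
    return [[c * v for v in row] for row in cov]

# Illustrative inputs only.
COV_TRUE = [[0.04, 0.01, 0.0], [0.01, 0.09, 0.02], [0.0, 0.02, 0.16]]
COV_HAT = [[0.05, 0.012, 0.001], [0.012, 0.08, 0.015], [0.001, 0.015, 0.2]]
```

Because the normalisation divides out any common factor, `regret(scaled(COV_HAT, c), COV_TRUE)` agrees with `regret(COV_HAT, COV_TRUE)` for any positive `c`, even though the matrix-norm error of the scaled estimate can be arbitrarily large.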
We study the squared price-of-risk premium of a portfolio -- an integrated conditional squared Sharpe-ratio functional, not an expected excess return -- and its attribution to causal drivers. Relative to a declared admissible benchmark it decomposes into intervention-stable premium, a signed causal distortion (the confounding wedge), and a nonnegative information loss; the loss is an $L^2$ projection residual, the wedge is not. The decomposition is well posed exactly when the driver filtration is immersed in the price filtration. It need not aggregate across portfolios pooling drivers: we identify an order-three obstruction that is invisible to every singleton and pairwise admissibility screen -- each one- and two-driver sub-book is immersed while the pooled triple reveals a future innovation -- the analogue of Bernstein's pairwise-but-not-mutually-independent triple, and minimal relative to such pairwise diagnostics. We separate its two ingredients, combinatorial masking and anticipative coupling. The failure is one of immersion, not of no-arbitrage. Experiments on synthetic single- and multi-driver panels show the decomposition and its causal correction are estimable, and that a permutation-calibrated screen detects planted order-three leakage with controlled false positives.
This paper compares different methods for forecasting the term structure of U.S. and European zero-coupon government bonds using both traditional econometric and Machine Learning (ML) approaches. We compare classical models (e.g., Dynamic Nelson-Siegel (DNS) and Principal Component Analysis (PCA)) with different Neural Network (NN) architectures, including those inspired by the classical models, on the U.S. Treasury market and bonds issued by the European Central Bank (ECB). To enhance predictive performance, macroeconomic variables are incorporated. The findings for both markets are separately analyzed and compared. To this end, we propose a robust model evaluation framework combining statistical accuracy metrics - such as RMSE, MAE, and directional accuracy - with the economic relevance of a quantitative bond trading strategy. Results show that NNs consistently outperform traditional models in both forecasting accuracy and portfolio performance. For the U.S., the most effective approach is a direct-forecasting NN that incorporates DNS factors to reduce the dimensionality of zero-rate data and an Autoencoder (AE) to extract macroeconomic features, while for Europe, the optimal model is a factor-based NN using PCA-derived zero-rate factors without the integration of macroeconomic variables. Overall, the paper demonstrates how combining traditional modeling approaches with modern ML techniques and evaluation can improve yield curve forecasts and support applications in fixed-income portfolio construction.
This paper examines portfolio optimization for commodity exchange-traded funds (ETFs) under heavy-tailed return behavior. Using daily Bloomberg data for 30 U.S.-listed commodity ETFs from 12 December 2018 to 16 December 2024, we study funds spanning agriculture, energy, metals, and broad commodity index exposure. We compare a passive buy-and-hold portfolio with rolling-window optimized portfolios formed under mean--variance and conditional value-at-risk (CVaR) criteria, considering both long-only and restricted long--short strategies. The results showed substantial heterogeneity across commodity sectors, with energy and broad commodity index funds displaying pronounced volatility, skewness, and excess kurtosis. Historical optimization indicated that minimum-risk and CVaR-based portfolios provided more stable cumulative performance than tangent portfolios and generally improved Sharpe, Calmar, and STARR$_{0.95}$ ratios. Extreme-value diagnostics showed that optimized portfolios remained exposed to heavy downside tails, so improved risk-adjusted performance did not eliminate extreme-loss risk. A dynamic extension based on ARMA--GARCH marginal models, Student--$t$ copula dependence, and one-step-ahead predictive scenarios improved performance mainly when combined with minimum-risk or CVaR-based objectives. Dynamic mean--variance tangent portfolios performed less reliably, reflecting sensitivity to expected-return estimation error. Transaction-cost robustness checks further showed that the practical value of dynamic optimization depended on turnover control, with low-turnover dynamic CVaR tangent portfolios remaining more resilient to implementation costs. Overall, the analysis showed that commodity ETF allocation benefited most from conservative and downside-risk-aware optimization, while optimized portfolios continued to require explicit tail-risk and implementation diagnostics.
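For readers unfamiliar with the tail metrics named above, the following minimal sketch computes a historical CVaR and the STARR ratio (mean excess return divided by CVaR) from a return series. The empirical tail convention used here (average of the worst ⌈n(1-α)⌉ losses) is one common choice and an assumption; the paper may use a different estimator. Function names are hypothetical.

```python
import math

def historical_var_cvar(returns, alpha=0.95):
    """Empirical VaR and CVaR of losses at confidence level alpha."""
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    # round() guards against float noise such as 100 * (1 - 0.95) = 5.000000000000004
    k = max(1, math.ceil(round(len(losses) * (1.0 - alpha), 9)))
    tail = losses[:k]
    var = tail[-1]        # smallest loss inside the tail
    cvar = sum(tail) / k  # average loss inside the tail
    return var, cvar

def starr(returns, risk_free=0.0, alpha=0.95):
    """STARR_alpha = mean excess return / CVaR_alpha."""
    mean_excess = sum(r - risk_free for r in returns) / len(returns)
    _, cvar = historical_var_cvar(returns, alpha)
    return mean_excess / cvar
```

Unlike the Sharpe ratio, the denominator here responds only to the loss tail, which is why the two ratios can rank heavy-tailed portfolios differently.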
Commodity futures can be represented hierarchically, with underlying assets at the upper level and individual futures contracts at the lower level. Entities at each level can be connected by edges reflecting inherent correlations, with cross-level edges capturing contract-to-underlying asset connections. Building on our observations of these structures, we propose a hierarchical graph learning approach for calendar spread (CS) strategies in commodity futures markets, addressing two significant gaps in the machine-learning literature: (i) the absence of learning-based methods for CS strategies in futures markets, and (ii) the lack of consideration of maturity-dependent interrelationships across commodity futures. We first establish the efficacy of CS strategies by analytically showing that CS strategies can possess higher risk-adjusted returns, measured by the information ratio, and lower risk, measured by variance and delta, than long-only strategies. We then introduce a method to convert learning-based predictions into CS positions. Next, we develop a hierarchical graph learning method that predicts futures price movements by utilizing the maturity-dependent interrelationships, thereby yielding a CS trading algorithm. Empirical results on commodity futures markets traded on the Chicago Mercantile Exchange Group demonstrate that our method outperforms benchmark models in both prediction and trading performance. We find that maturity-dependent interrelationships across commodity futures are instrumental in prediction and that CS trading based on hierarchical graph learning is effective for statistical arbitrage.
This paper proposes a two-stage decision support system for long-short portfolio optimization under environmental, social, and governance (ESG) considerations. In the first stage, assets are evaluated using a multi-criteria procedure based on TODIMSort, with criterion weights derived using the MEREC (Removal Effects of Criteria) method. This allows assets to be assigned to classes ordered according to preferences that respond to market conditions and investor priorities, thus generating sets of long and short opportunities that dynamically adapt to the prevailing regime. In the second stage, we formulate a non-convex portfolio optimization problem that maximizes the Omega ratio while respecting budget, bound and leverage constraints. To solve it, we introduce an adaptive particle swarm solver equipped with a controller that selects, at each iteration, the most suitable recombination operator from a diverse pool of operators and combines it with a projection-based repair mechanism for constraint management. The empirical study, conducted on 421 stocks in the STOXX Europe 600 index, examines both the exploration capabilities and solution quality of the proposed solver compared to state-of-the-art benchmarks, as well as the ex post profitability of the resulting portfolio strategies. The results show that ESG-enhanced long-short portfolios offer competitive and often superior performance compared to their non-ESG counterparts and the market-value-weighted benchmark.
It meets a 25-minute deadline for a 10,000-instrument universe where the OSQP baseline completes only 4 of 500 accounts.
Institutional rebalancing is a batched optimization workload with a hard operating deadline: hundreds of accounts need new weights under budget, turnover, exposure, exclusion, and tax-aware controls before trading can proceed. This paper evaluates Asymmetry PRISM, a CPU/GPU portfolio optimization engine, through a public evaluation boundary: problem data go in, and returned weights, status codes, timings, memory class, external feasibility diagnostics, eligible objective comparisons, and audit records come out. Within that boundary, the evaluation protocol fixes hardware and software versions, declares timing lanes, separates cold single calls from repeated workloads, and admits objective-gap claims only where an eligible reference solver completed. On completed multi-solver rows from N=100 to N=2,000, Asymmetry PRISM-CPU is 4.5x to 24.1x faster than the fastest completed reference row in the same lane. In the production queue study, Asymmetry PRISM-GPU completes 500/500 accounts over a 10,000-instrument universe in 109.5 s within a declared 25-minute operating window, with zero missed deadlines and an audit record for every solve; the recorded OSQP queue baseline completes 4/500. On an operationally constrained real-data suite (tax-motivated transition penalties, restriction caps, turnover controls, batches), Asymmetry PRISM clears constrained solves 3.4x to 126.7x faster than the best completing incumbent at certified-equal objectives, and the GPU route widens to 8.8x over the CPU route at N=384,800. Rows without a completed reference are reported as feasibility, timing, memory, and failure-status evidence.
This paper develops a reinforcement-learning approach to continuous-time risk-sensitive benchmarked asset allocation in a partly model-based setting. The benchmarked problem does not directly fit the standard Markovian stochastic-control template: the state is uncontrolled, whereas the terminal reward contains a controlled It\^o integral. We use free energy-entropy duality to reformulate the problem as a linear-quadratic-Gaussian stochastic differential game under an equivalent probability measure, yielding explicit finite- and infinite-horizon saddle-point solutions. This structure guides a continuous-time $q$-learning actor-critic method: the quadratic value function motivates the critic, while the affine saddle-point controls motivate deterministic actors for the portfolio allocation and adversarial control. The learned allocation admits an economic interpretation through fractional Kelly decompositions. A proof-of-concept implementation calibrated to U.S. equity data shows that the actors learn the optimal policy with high accuracy and reveals a favorable asymmetry: the portfolio actor receives a cleaner learning signal than the auxiliary adversarial actor.
We propose a framework for designing Target-Date Funds (TDFs) around an explicit return objective while controlling risk directly at the portfolio level through a declining Conditional Value-at-Risk (CVaR) constraint. In this approach, the regulator or sponsor specifies a CVaR glidepath that gives the portfolio manager enough flexibility to reach a target return with a reasonably high probability. The target return is determined exogenously from pension-design inputs such as retirement age, contribution rate, working years, life expectancy, and replacement-rate goals. This differs from conventional TDF design, where age-dependent asset-class limits are set without an explicit link to a required return.
A key feature of the method is that it does not assume the manager selects an optimal portfolio each period. Instead, each month the manager draws an allocation from the set of portfolios satisfying the CVaR constraint. This yields a conservative evaluation of each glidepath: success probabilities are averages over admissible allocations, rather than best-case outcomes. We introduce two figures of merit: the probability of meeting the target return and the cumulative risk assumed over the life of the TDF.
As a proof of concept, we apply the framework to Chile's 2025 pension reform using nine Chilean and global asset classes and a 40-year accumulation horizon. The results show that the transition age at which risk starts to decline is the most consequential design parameter, and that contribution density acts as a hard constraint: below a critical threshold, portfolio design alone cannot compensate for structurally low contributions. The framework is general and can be applied to any TDF designed around an explicit return objective.
Static HPO projections induce base policies in RLPO whose myopic value gaps are priced by a performance-difference identity.
Practitioners allocate capital with forecast-light rules such as equal weight, inverse volatility, risk parity, HRP, and return-adjusted HRP (RA-HRP). This paper develops \emph{Heuristic Portfolio Optimization} (HPO): an information-restricted projection of the Markowitz/tangency solution onto a stable rule class. The implied-return principle, $\w$ is maximum-Sharpe iff $\bmu_e\propto\bSigma\w$, gives closed-form optimality sets for leading heuristics and exposes the Schur-complement substitutions behind HRP. For RA-HRP, we introduce fixed-tree cluster-Sharpe recursion, unit-free HRP--RA-HRP interpolation, tangency conditions, conditional-risk splits, and pathwise/KL decompositions of weight distortion. First-order Sharpe calculus expresses the marginal value of return information as nodewise alphas against HRP and yields a linear KL trust budget. We formalize generic HPO maps, define the implied-return defect, prove that it equals squared Sharpe inefficiency, characterize tree-HPO coincidence by nodewise mass ratios, and give a bias--variance decomposition for estimated rules. Finally, HPO is embedded into Reinforcement Learning Portfolio Optimization (RLPO): every HPO map induces a deterministic stationary policy; static HPO is the $\gamma=0$ no-friction face of the Bellman problem; RA-HRP supplies a hierarchical policy prior; and dynamic improvement is warranted when continuation value exceeds myopic HPO defect plus frictions. A performance-difference identity prices the myopic value gap, gives an $\varepsilon/(1-\gamma)$ myopia bound, and identifies nodewise alphas as policy-gradient coordinates of the hierarchical actor. Thus HPO is the static optimality layer and RLPO the dynamic control layer. The conditions are GRS-testable, extend to mean--CVaR and expected utility under ellipticity, and become Kelly-growth conditions in diffusion limits.
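The implied-return principle quoted above follows from the first-order condition of the Sharpe ratio. The derivation below is a sketch in plain bold notation (in place of the abstract's macros), assuming $\boldsymbol{\Sigma}$ is positive definite and the proportionality constant is positive; it is the textbook argument, not necessarily the paper's own proof.

```latex
\begin{aligned}
S(\mathbf{w}) &= \frac{\mathbf{w}^\top \boldsymbol{\mu}_e}{\sigma_w},
\qquad \sigma_w = \sqrt{\mathbf{w}^\top \boldsymbol{\Sigma}\,\mathbf{w}}, \\[4pt]
\nabla S(\mathbf{w}) &= \frac{\boldsymbol{\mu}_e}{\sigma_w}
 - \frac{\mathbf{w}^\top \boldsymbol{\mu}_e}{\sigma_w^{3}}\,\boldsymbol{\Sigma}\mathbf{w}, \\[4pt]
\nabla S(\mathbf{w}) = \mathbf{0}
 &\iff \boldsymbol{\mu}_e
 = \frac{\mathbf{w}^\top \boldsymbol{\mu}_e}{\mathbf{w}^\top \boldsymbol{\Sigma}\,\mathbf{w}}\;
   \boldsymbol{\Sigma}\mathbf{w}
 \;\propto\; \boldsymbol{\Sigma}\mathbf{w}. \\[8pt]
\text{Conversely, if } \boldsymbol{\mu}_e &= \lambda\,\boldsymbol{\Sigma}\mathbf{w},\ \lambda > 0,
\text{ then for any } \mathbf{v}: \\[4pt]
S(\mathbf{v}) &= \lambda\,\frac{\mathbf{v}^\top \boldsymbol{\Sigma}\,\mathbf{w}}
 {\sqrt{\mathbf{v}^\top \boldsymbol{\Sigma}\,\mathbf{v}}}
 \;\le\; \lambda \sqrt{\mathbf{w}^\top \boldsymbol{\Sigma}\,\mathbf{w}} = S(\mathbf{w})
 \quad \text{(Cauchy--Schwarz in the } \boldsymbol{\Sigma} \text{ inner product)}.
\end{aligned}
```

As a one-line instance: equal weighting, $\mathbf{w} = \mathbf{1}/n$, is maximum-Sharpe exactly when $\boldsymbol{\mu}_e \propto \boldsymbol{\Sigma}\mathbf{1}$, that is, when each asset's expected excess return is proportional to the corresponding row sum of the covariance matrix.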
We consider a continuous time investment problem in a multi-asset Black-Scholes market with the following features: The assets' drifts are not known and constitute a source of model ambiguity. However, there is a prior distribution (knowledge) on the possible drifts. Our investor is ambiguity averse and wants to maximize a mean-variance criterion for the terminal wealth where ambiguity aversion is incorporated in a smooth way. We consider here the criterion introduced in Maccheroni et al. 2013 where the variance is decomposed and each part is weighted differently to account for different levels of market risk and model ambiguity aversion. We use a novel approach to find the optimal dynamic investment strategy within the class of all adapted strategies which allow for learning. We also present a number of numerical results which help to understand how the model parameters affect the optimal investment strategy. In general it turns out that ambiguity averse investors invest less in the risky assets.
We extend the return extrapolation framework of Atmaz (2022) to incorporate two behaviorally realistic features absent from the linear benchmark: saturation in belief updating and asymmetry between gains and losses. We introduce a smooth, nonlinear, asymmetric extrapolation function and characterize the optimal portfolio of a CRRA investor under Heston (1993) stochastic volatility as the sum of a sentiment-distorted myopic demand, a variance hedging demand, and a sentiment hedging demand. The resulting semilinear Hamilton-Jacobi-Bellman equation is solved by two independent numerical methods, a finite-difference ADI scheme with time-step policy iteration and a deep learning-driven iterative scheme. The model generates four investor-level behavioral anomalies: asymmetric responses to gains and losses, attenuated reactions at extremes, excess trading volume, and welfare loss rising with the strength of extrapolation, each of which maps onto documented empirical patterns. Its central finding is that saturation acts as an endogenous correction mechanism: at the same local slope at the origin, the asymmetric nonlinear extrapolator carries a smaller welfare loss than a linear one.
Benchmarking forecasting architectures for daily equity portfolios is not just a prediction exercise. It also asks which model remains usable after preferences, costs, and portfolio constraints are imposed. We build a CRSP daily-stock benchmark for 15 deep and statistical time-series architectures over 2018--2024. The protocol combines common-window decile portfolios, stochastic multi-criteria acceptability analysis, a deployment-adjusted acceptability index, and a constrained quadratic portfolio layer with capacity, beta, industry, risk, leverage, and turnover controls. The index starts from the SMAA rank-acceptability distribution and downweights models whose criteria-level wins produce high portfolio regret; its Gibbs form is characterized as an entropic update from the SMAA prior. Empirically, no architecture dominates the raw benchmark: TransEnc-8 has the largest rank-1 acceptability, 0.352, and no model exceeds about 0.36. Rankings vary with preferences, market state, feature universe, and transaction costs. In the promoted five-model constrained-portfolio comparison, TransEnc-8 is selected throughout, while return-oriented raw rankings can favor TS-RIDGE. Broad-universe decile signals can survive costs, but the baseline constrained-QP net Sharpe at 20 bps is negative for every promoted model. The benchmark supports model selection and diagnosis rather than a standalone trading-strategy claim.
Deep reinforcement learning (DRL) frameworks for portfolio optimization have shown promise for their ability to learn allocation rules dynamically from market data. However, these models fail to account for fat-tailed returns, which characterize actual market behavior with more frequent extreme events. Furthermore, historical data is treated homogeneously, without accounting for temporal importance, leading models to fail during regime changes. We propose a new BAVAR-BLED algorithm that combines methods derived from Bayesian-Averaging Vector Autoregressive (BAVAR) and the Black-Litterman model using Elliptical Distributions (BLED) within a TD3 architecture. BAVAR captures a set of vector autoregressive representations that consider multi-scale temporal features, enabling adaptive allocation decisions based on regime-aware estimates of return expectations and dispersion matrices. These estimates serve as prior inputs to BLED, a model that uses Student's t-distributions, allowing for more realistic fat tail return estimates. The BAVAR-BLED algorithm uses transformer networks for view construction and CNNs for risk-aversion estimates, which modify dynamic allocation decisions based on market conditions. An evaluation of 29 Dow Jones Industrial Average constituents over a decade-long market period shows that BAVAR-BLED significantly outperforms state-of-the-art methods, achieving Sharpe and Sortino ratios of 1.72 and 2.70, respectively, and total returns of 57.26%.
This paper studies a cash-overlay allocation problem between a static growth-defensive risky sleeve and interest-bearing cash. The risky sleeve is fixed as a 50/50 combination of equal-weight growth and defensive ETF baskets, so the cash overlay is evaluated independently of any dynamic growth-defensive style-timing policy. The target is future risky-sleeve return over cash, with the cash leg measured using the contemporaneous cash rate.
I develop two continuous filters. The slow-tail compensation filter targets persistent deterioration in risky-sleeve compensation, especially regimes in which cash yield rises and risky assets remain unstable. The V-shape crash-brake filter targets fast drawdown episodes and subsequent re-entry. The two filters are combined using a fixed max-cash rule, under which the portfolio uses the larger of the two cash weights each day.
On the common 2017-2026 window, the selected-weight max-cash combination earns a 20.45 percent CAGR versus 16.62 percent for the static risky sleeve, and improves maximum drawdown from -33.59 percent to -16.77 percent. A stricter version combines each component's own walk-forward out-of-sample weights. In the main OOS window, the expanding max-cash combination earns 18.05 percent versus 16.09 percent for the static risky sleeve, with maximum drawdown of -22.05 percent versus -33.59 percent. The evidence supports modular continuous cash overlays as drawdown-control tools, while leaving multiple-testing-adjusted inference and real-time variable re-screening for future work.
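The fixed max-cash rule described above is simple to state in code. The sketch below assumes each filter has already produced a daily cash weight in [0, 1]; the clipping step, the blending formula for the overlay return, and all function and argument names are illustrative assumptions rather than the author's implementation.

```python
def max_cash_weights(slow_tail_cash, crash_brake_cash):
    """Each day, hold the larger of the two filters' cash weights."""
    if len(slow_tail_cash) != len(crash_brake_cash):
        raise ValueError("filter series must have the same length")
    combined = []
    for a, b in zip(slow_tail_cash, crash_brake_cash):
        c = max(a, b)
        combined.append(min(1.0, max(0.0, c)))  # clip to [0, 1] (assumption)
    return combined

def overlay_returns(cash_weights, risky_returns, cash_returns):
    """Blend the static risky sleeve with interest-bearing cash."""
    return [
        (1.0 - c) * r + c * k
        for c, r, k in zip(cash_weights, risky_returns, cash_returns)
    ]
```

Taking the maximum means either filter alone can de-risk the portfolio, while re-entry into the risky sleeve requires both filters to lower their cash weights.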
Exact identity under i.i.d. costs and mean-unbiased Markov policies yields a model-free audit for sequential decisions.
We study the problem of auditing a black-box algorithmic decision-maker from observable inputs and outputs alone. Our main result is an exact decomposition: under precisely characterized conditions, the cumulative \emph{regret} of a dynamic policy equals the sum of per-period covariances between the cost vector and the policy's decision. This extends the single-period identity of Aldridge~(2026) to the full multi-period setting of stochastic dynamic programming.
We prove the identity holds exactly under i.i.d. costs and mean-unbiased Markov policies, derive closed-form bias corrections for non-stationary and time-varying cases, and establish the discounted-horizon analog. A Bellman recursion for the covariance regret functional connects the result to standard reinforcement learning algorithms; for rolling-window policies, the estimation-error bias is $O(d/w)$.
The decomposition has direct implications for algorithmic auditing in strategic environments: in platform mechanism design, it provides a welfare-based audit metric without access to the agent's private type; in repeated games, covariance reduction is a sufficient condition for policy improvement; in procurement and ad auctions, the bias correction quantifies welfare loss from strategic misreporting. The associated trajectory estimator is consistent, asymptotically normal with HAC variance, and computable in $O(T \cdot nd)$ time. This makes the proposed approach a tractable, model-free audit tool for platform mechanisms, algorithmic portfolio strategies, and any sequential decision system subject to external performance review.
This paper uses European put options to construct the p-index risk measure and applies it to evaluate the performance of different investment strategies on China's SSE 50 index and the US S&P 500 index during 2018-2023. The p-index measures the insurance fee per insured dollar required to guarantee that the asset achieves at least a delta rate of return on a specified future date. With the fair price strategy, one-week and one-month holding periods can earn more, and among seven economic sectors, materials-sector stocks generated the highest annualized rates of return: 11.04% (one-week period), 11.93% (two-week period) and 10.18% (one-month period). Among momentum and contrarian strategies with a one-week holding period, the p-ratio-efficient-contrarian strategy produced the highest annualized rate of return (9.97%), followed by the p-index-inefficient-momentum strategy (9.01%) and the p-index-efficient-contrarian strategy (6.48%). The MCIRS method employing the p-index consistently delivered higher returns than its beta-based counterpart, and efficient (outperforming) stocks failed to sustain their momentum while inefficient (underperforming) stocks exhibited no mean reversion. The p-index-efficient-contrarian strategy outperformed in low-sentiment (low-volume) regimes, while the p-index-inefficient-momentum strategy outperformed during high-sentiment (high-volume) periods. For the five hundred stocks of the US S&P 500 index during 2018-2023, efficient stocks sustained their momentum while inefficient stocks exhibited mean reversion. The p-index-efficient-momentum strategy produced the highest annualized rate of return (3.69%), followed by the p-ratio-inefficient-contrarian strategy (3.67%) and the beta-efficient-momentum strategy (3.48%).
We test whether large language models (LLMs) add value in commodity portfolio construction when the information set and implementation rules are held fixed across strategies. A Hawkish Agent (inflation-tightening prior), a Dovish Agent (growth-easing prior), a Debate Agent, and a deterministic z-score Rule Agent each receive identical FRED macro z-scores and route their tilt signals through the same portfolio engine. Across 124 weekly rebalancing dates spanning the 2023 U.S. rate peak and the 2024-2025 soft landing, all three LLM strategies outperform the Rule Agent in Sharpe terms; the Hawkish and Debate Agents record the largest gains (\Delta Sharpe = +0.044 and +0.040, both p < 0.10 under a block bootstrap) and preserve a net-of-cost advantage over the passive inverse-volatility benchmark at one-way trading costs up to 30 basis points, while the Rule Agent's thin margin over passive disappears at approximately 5 basis points. The Debate Agent does not outperform the best single agent (\Delta Sharpe = -0.004, p = 0.769); its contribution is bias correction -- averaging out the Dovish Agent's miscalibrated prior -- rather than deliberation-generated return. The performance advantage is concentrated in the soft-landing sub-period, the evaluation window spans a single rate cycle, and the reported $p$-values are unadjusted for multiple comparisons. Within these limits, the results suggest that an LLM acting as a constrained macro-interpretation function can add modest but economically meaningful value over a transparent rule layer, though the margin is small and its persistence beyond this sample is unknown.
Quantum combinatorial optimization offers theoretical advantages for complex financial modeling, but physical implementation on Noisy Intermediate Scale Quantum (NISQ) devices is severely constrained by hardware topology. This study presents a hardware benchmarking analysis between a Hardware Efficient Variational Quantum Neural Network (HE-VQNN) and the Warm Start Quantum Approximate Optimization Algorithm (WS-QAOA) for a hybrid Mean Variance and Conditional Value at Risk (CVaR) portfolio objective. By implementing a novel classical quantum hybrid proxy matrix to bypass the CVaR auxiliary qubit bottleneck, we map up to 16 assets from the NIFTY 50 index onto an IBM heavy hex processor. We systematically quantify algorithmic resilience to the "SWAP tax" incurred during routing. Empirical results reveal a critical operational trade-off: WS-QAOA provides exact theoretical mapping but suffers catastrophic hardware decoherence due to exponential nonlocal gate overhead. Conversely, HE-VQNN preserves hardware coherence but lacks the mathematical expressibility to capture dense tail risk asset correlations. This study shows that dense financial optimization on current architectures forces a nonviable choice between algorithmic inexpressibility and hardware decoherence. This points to a deeper limitation on what can and cannot be done with NISQ computers that lack all-to-all connectivity.
Indonesian data across ten years shows linear methods suit taxonomy while flexible graphs expose cross-sector links such as commodities.
The collective movement of stock prices harbors complex interdependencies that are conventionally simplified only through a linear lens. This paper explores computed structural network representations in the Indonesian capital market by testing the limits of Pearson correlation and Mutual Information (MI) in unveiling the spectral dynamics of the market. Across 2,328 rolling observation windows from 2015 to 2025, we examine 24 methodological configurations that combine three dependency estimators (Pearson, MI adaptive binning, and MI-kNN), two graph filtering schemes (Minimum Spanning Tree/MST and Planar Maximally Filtered Graph/PMFG), and four community decoders.
The empirical results unveil a fundamental reality: topological richness does not always resonate with sectoral classification precision. The Pearson, MST, and Infomap configuration is shown to remain the most robust foundation for recovering conventional sectoral taxonomy. Nevertheless, when deeper observation demands the exposition of local structures and the weave of heterogeneous communities, the architectural relaxation through PMFG demonstrates its superiority. In the realm of residual information detection, MI adaptive binning appears far more proportional than kNN; histogram-based regularization successfully tames empirical noise without sweeping away traces of non-linear dependency. Ultimately, the synergy of MI and PMFG is not positioned to dethrone the dominance of linear correlation, but rather to provide an essential analytical lens for excavating hidden economic sub-structures -- such as the cohesion of commodity regimes -- that have long transcended the rigid boundaries of the market's formal sectors.
A portfolio is \emph{anticipatory} when its optimizer acts on a richer model than the myopic, price-taking estimator used to calibrate it. Enrichment may be informational, via enlarged filtrations; dynamic, via horizon forecasts; or performative, via the deployment law induced by market impact. We give a decision-theoretic definition for all three cases and measure anticipation by the realized control gap between enriched controller and restricted estimator. The same quadratic geometry separates information, planning value, impact correction, and overfitting.
For log utility under initial enlargement, value is the information-drift energy $\frac12\mathbb{E} \int_0^T\alpha_t^2\,dt$, equivalently mutual information or relative entropy. In mean-variance form, signal value is $\frac{1}{2\gamma}{\rm tr}(\Sigma^{-1}\Omega)$. Dynamic forecast anticipation gives a finite-horizon quadratic premium in the forecast stack, while permanent impact changes the price-taking allocation $\theta_{\rm na} =(\Lambda+\gamma\Sigma)^{-1}\mu$ into $\theta_{\rm an} = (2\Lambda+\gamma\Sigma)^{-1}\mu$ and reveals a spectral phase transition for naive recalibration. The main result is a stacked finite-horizon LQG decomposition: information, forecast, and impact combine into an information trace plus one inverse-precision norm, whose expansion yields the impact term, forecast term, and signed forecast-impact interaction. Sharp angle bounds and an orthogonal nonnegative projection identity resolve the signed term. The stationary extension endogenizes information covariance as Kalman error reduction and carries impact anticipation to an infinite-horizon Lyapunov trace with transaction costs. Finally, the penalty $\frac{1}{2}{\rm tr}(H^{-1}\Sigma_\varepsilon)$ shows that correctly specified anticipation creates value, vacuous anticipation has zero value, and misspecified anticipation is harmful when estimated structure is optimized as true.
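The two allocation formulas quoted above can be compared numerically. The following is a minimal two-asset sketch; the matrices, the risk aversion, and the helper names are illustrative assumptions rather than values from the paper.

```python
def solve_2x2(m, v):
    """Solve the 2x2 linear system m @ x = v by Cramer's rule."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-15:
        raise ValueError("matrix is singular")
    return [(d * v[0] - b * v[1]) / det, (a * v[1] - c * v[0]) / det]


def allocation(lam, sigma, mu, gamma, impact_multiple):
    """Return (impact_multiple * Lambda + gamma * Sigma)^{-1} mu.

    impact_multiple = 1 gives the price-taking allocation theta_na,
    impact_multiple = 2 gives the impact-anticipating allocation theta_an.
    """
    m = [[impact_multiple * lam[i][j] + gamma * sigma[i][j] for j in range(2)]
         for i in range(2)]
    return solve_2x2(m, mu)


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    lam = [[0.5, 0.0], [0.0, 0.2]]        # permanent impact matrix Lambda
    sigma = [[0.04, 0.01], [0.01, 0.09]]  # return covariance Sigma
    mu = [0.05, 0.07]                     # expected returns
    gamma = 5.0                           # risk aversion

    theta_na = allocation(lam, sigma, mu, gamma, impact_multiple=1)
    theta_an = allocation(lam, sigma, mu, gamma, impact_multiple=2)
    print("theta_na:", theta_na)
    print("theta_an:", theta_an)
```

When $\Lambda = 0$ the two allocations coincide, so any gap between them is attributable to the impact term alone.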
We study optimal portfolio choice for a household simultaneously managing a random-deadline goal, such as a medical emergency or job loss, and a fixed-deadline goal such as retirement or college tuition. Under a forced funding rule, in which each goal is paid in full whenever affordable, the household maximizes a weighted sum of the probabilities of fully funding both goals in a Black--Scholes market. We identify two novel effects absent from single-goal models: a growth crowding-out effect, in which precautionary saving for the random goal distorts investment toward the fixed goal, and a deadline pressure effect, in which a compressed saving horizon forces excess risk-taking. A striking implication is that the value function need not be monotone in wealth: a household just above the random-goal threshold is forced to pay it when the shock arrives, depleting its wealth for the fixed goal, and ends up worse off than a slightly poorer household that missed the random goal but kept its wealth intact. This non-monotonicity is absent from all single-goal benchmarks and arises purely from the interaction between the two goal types under forced funding. We further study an optional funding variant in which the household may decline the fixed-deadline goal at time $T$ rather than being required to fund it. We characterize the ex ante option value, i.e., the full time-$0$ value of this flexibility and the terminal option value, i.e., its value at the funding decision node. We find that both options are most valuable at intermediate wealth levels where paying the fixed-deadline goal would substantially reduce the continuation value of the random-deadline problem.
We study an infinite-horizon optimal consumption-investment problem for an investor with Epstein-Zin stochastic differential utility with stochastic investment opportunities in an incomplete market. Risk aversion and intertemporal substitution are separated, and we work in the regime $\theta\in(0,1)$, where there exists a unique generalised utility process for arbitrary non-negative progressively measurable consumption streams. Our main contribution is a variational characterisation of the value function. We show that the value function is the unique minimiser of a functional whose Euler-Lagrange equation coincides with the Hamilton-Jacobi-Bellman equation. Although the functional may be non-convex, the direct method yields existence, and we prove every minimiser is strictly positive, bounded, and classical. A verification theorem identifies any minimiser with the value function and gives feedback representations for optimal consumption and investment policies. The proof combines a change of measure to the myopic probability with uniqueness results for Epstein-Zin BSDEs and a perturbation argument for optimality. Examples with stochastic volatility, Gaussian excess returns, and fat-tailed excess returns illustrate the scope of the framework and its implications for intertemporal hedging.
We consider the problem of estimating the true Sharpe ratio of an asset selected for having the highest observed in-sample Sharpe ratio among many assets. We discuss estimators based on the polyhedral lemma, James Stein shrinkage, debiasing the expected maximum Sharpe ratio, thresholding and empirical Bayes. We test these estimators in simulations, computing bias and root mean square error across different values of sample size, number of assets, and spread and shape of population Sharpe ratios. We also compute rank correlation of the estimators against the underlying quantity, simulating how these estimators might be used to compare or rank the output of different teams which perform this selection process. We find that the James Stein estimator provides the best performance across many different realistic values of the relevant parameters, followed by the GMLEB estimator of Jiang and Zhang. These results are fairly robust to correlation of asset returns, with some caveats.
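As a rough illustration of the selection problem, the sketch below applies positive-part James-Stein shrinkage toward the grand mean to simulated Sharpe ratios and reports the estimate for the asset with the highest observed value. The Gaussian approximation with variance 1/n for a sample Sharpe ratio, the simulation parameters, and the function names are assumptions made for this example; the paper's estimators may differ in detail.

```python
import math
import random
import statistics


def james_stein_toward_mean(observed, noise_var):
    """Positive-part James-Stein shrinkage of estimates toward their mean."""
    k = len(observed)
    if k < 4:
        raise ValueError("shrinkage toward the mean needs at least 4 estimates")
    grand_mean = statistics.fmean(observed)
    spread = sum((x - grand_mean) ** 2 for x in observed)
    if spread == 0.0:
        return list(observed)
    # Shrink more when the noise is large relative to the observed spread.
    factor = max(0.0, 1.0 - (k - 3) * noise_var / spread)
    return [grand_mean + factor * (x - grand_mean) for x in observed]


def simulate(num_assets=50, num_periods=252, seed=0):
    """Return (true, observed, shrunk) Sharpe ratio of the selected asset."""
    rng = random.Random(seed)
    true_sr = [rng.gauss(0.05, 0.03) for _ in range(num_assets)]
    # Assumption: sample Sharpe is approximately N(true, 1/num_periods).
    noise_var = 1.0 / num_periods
    noise_sd = math.sqrt(noise_var)
    observed = [sr + rng.gauss(0.0, noise_sd) for sr in true_sr]
    shrunk = james_stein_toward_mean(observed, noise_var)
    best = max(range(num_assets), key=observed.__getitem__)
    return true_sr[best], observed[best], shrunk[best]


if __name__ == "__main__":
    true_value, observed_value, shrunk_value = simulate()
    print(f"true={true_value:.4f} observed={observed_value:.4f} "
          f"shrunk={shrunk_value:.4f}")
```

Because the selected asset has the largest observed value, shrinkage can only pull its estimate down toward the cross-sectional mean, which is the direction needed to offset selection bias.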
Financial markets are inherently non-stationary, exhibiting frequent regime shifts and structural changes that render traditional Portfolio Management (PM) approaches ineffective. Existing remedies, such as rolling-window retraining and naive online fine-tuning, are hindered by high computational costs and insufficient knowledge utilization, respectively, resulting in low returns and limited adaptability. Continual learning (CL) offers a promising paradigm by enabling trading agents to accumulate and transfer knowledge across sequential tasks. In this paper, we propose \textbf{Re}gime-aware \textbf{C}ontinual \textbf{A}daptive \textbf{P}ortfolio management (\textbf{ReCAP}), a novel framework that integrates CL into PM to address the challenges of dynamic financial environments. ReCAP employs an adaptive regime detection module to segment historical market data into variable-length regimes, enabling regime-specific learning of policy vectors and the construction of a policy library. During continual trading, a regime-gate module adaptively combines policy vectors from the library based on the current market state, facilitating rapid adaptation to newly detected regimes. Only the regime-gate and the current regime's policy vector are continually updated to preserve useful knowledge effectively. Extensive experiments on five real-world datasets demonstrate that ReCAP consistently outperforms popular baselines, achieving superior returns in long-term investment horizons and rapid adaptation to regime shifts.
Classical portfolio optimization treats expected returns, covariances, and allocations as deterministic. Modern practice replaces at least one by a distribution: a posterior over parameters, a law of future returns, a stochastic allocation policy, or a distributional-robustness set. We call distributional portfolio optimization (DPO) the unified framework in which weights, returns, and parameters are all modeled as probability measures, organized around the joint coupling $\Gamma_\theta(dw,dr)$ and its marginal triple $(W,R,P)$. The contribution is synthetic and structural: we organize Bayesian, robust, chance-constrained, stochastic-allocation, and distributional reinforcement-learning portfolio methods through this coupling and prove boundary results connecting them, including a portfolio specialization of Wasserstein-CVaR duality, a static no-randomization theorem, a Bayesian credible-radius calibration of Wasserstein DRO, a Gaussian-isotropic second-order conservatism bound, a conditional two-sided rate $W_1 = \Theta(n^{-(1+\alpha)/2})$ governed by the local boundary Holder exponent $\alpha \in [0,1]$, and a risk-shifted distributional Bellman contraction. A controlled experiment shows that across factor models at $K \in \{10, 25, 50\}$, the credible-radius rule lands within 3-7 bp of the oracle out-of-sample tail risk and beats a 24-month validation-tuned radius while spending no validation data. On a $K=25$ DJIA backtest, equal-weight, no-view Black-Litterman, and Ledoit-Wolf shrinkage attain higher Sharpe than every distributional method; the operational claim is therefore confined to calibration-without-validation and turnover, not raw-return dominance.
This paper compares a series of contemporary portfolio construction approaches using ten U.S. stocks (TSLA, WMT, BAC, GS, LLY, MRK, GOOG, META, AAPL and XOM) over the period from September 2023 to December 2025. The paper explores basic mean-variance optimization, constrained optimization, Fama-French five-factor regression modeling, Monte Carlo simulation, and the Black-Litterman model to determine how constraints, risk factors, simulated approximations, and specific market views may each affect portfolio allocation, performance, and stability. Overall, the results show that standard optimization may produce highly concentrated portfolios, while constrained optimization changes portfolio allocations by altering the efficient frontier. The five-factor regressions suggest a basic investment style of defensive large value and profitability exposure. Monte Carlo approximation is a viable technique for arriving at mean-variance optimal portfolios provided the number of simulations is high enough, especially under a box constraint. The Black-Litterman approach produces more economically intuitive allocations and greater stability than standard mean-variance optimization, because it balances equilibrium returns with investor views.
This study looks at the statistical properties and predictability using deep learning methods of the U.S. aggregate bond index in daily observations spanning 2018 to February 2026. We first establish that index levels are extremely persistent and consistent with unit-root behavior (Dickey and Fuller), while log returns are covariance-stationary with weak linear dependence and pronounced volatility clustering characteristic of ARCH-type processes (Engle; Bollerslev). Motivated by the trade-off between stationarity and information retention, we construct a "stationary but maximally persistent" representation via fractional differencing (Granger and Joyeux; Hosking) following the procedure of L\'opez de Prado, and evaluate short-horizon forecasts using two neural paradigms: (i) Multilayer Perceptrons (MLPs) trained on lagged vectors with joint lag-length and hyperparameter tuning (Hornik et al.; Rumelhart et al.); and (ii) Convolutional Neural Networks (CNNs) trained on Gramian Angular Field (GAF) image encodings (Wang and Oates). Empirically, MLPs match the strong naive persistence benchmark on levels, collapse toward near-zero forecasts on returns, and achieve the strongest incremental performance on the fractionally differenced series, where moderate dependence remains but unit-root drift is attenuated. In contrast, CNN-GAF models deliver consistently negative out-of-sample $R^2$ across all three representations. Overall, the results imply that, for short-horizon forecasting of broad bond indices, the primary determinant of predictive performance is the transformation of the series (its degree of stationarity and memory) rather than architectural complexity. Lag-based models remain competitive under persistence, while GAF-based CNNs are better suited to pattern-based tasks than to persistence-dominated next-step prediction.
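Fractional differencing of order $d$ expands $(1-B)^d$ into lag weights $w_0 = 1$, $w_k = -w_{k-1}(d-k+1)/k$. The sketch below computes these weights and applies them over a fixed-width window; the window width and function names are choices made for this example and are not taken from the study.

```python
def frac_diff_weights(d, width):
    """First `width` coefficients of the binomial expansion of (1 - B)^d."""
    weights = [1.0]
    for k in range(1, width):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights


def frac_diff(series, d, width):
    """Fixed-width fractional difference; drops the first width - 1 points."""
    weights = frac_diff_weights(d, width)
    out = []
    for t in range(width - 1, len(series)):
        # weights[k] multiplies the observation lagged by k periods.
        out.append(sum(weights[k] * series[t - k] for k in range(width)))
    return out


if __name__ == "__main__":
    levels = [100.0, 100.4, 100.1, 100.9, 101.3, 101.0, 101.8]
    print("weights d=0.5:", frac_diff_weights(0.5, 4))
    print("d=0.5 series :", frac_diff(levels, 0.5, 4))
    print("d=1.0 series :", frac_diff(levels, 1.0, 2))  # ordinary differences
```

Setting $d = 1$ recovers ordinary first differences, while $0 < d < 1$ keeps slowly decaying weights on older observations, which is how the transformation retains memory.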
Heston-Bates-CIR calibration to equity options and Euribor shows continuous volatility controls short horizons while stochastic rates affect
This study develops an integrated stochastic modeling framework for pricing short and medium-maturity equity options and assessing interest-rate risk using the Heston (1993), Bates (1996), and CIR (1985) models. We calibrate the Heston model using both the Lewis (2001) Fourier inversion and the Carr-Madan (1999) FFT approach, finding near-identical parameter sets, which is consistent with the calibration stability reported in recent studies such as Agazzotti et al. (2025). Extending the model to Bates shows that jump intensities converge to values effectively equal to zero for 60-day maturities, echoing empirical findings that jumps contribute marginally to short-term smile fitting. We further compare our calibration approach with the joint volatility-surface and variance-term-structure framework proposed by Yoo (2025), confirming that standard Heston/Bates calibration remains robust for the maturities considered. Finally, we calibrate the CIR short-rate model to the Euribor term structure, generating positive and economically consistent forward-rate scenarios in line with recent stochastic-rate option-pricing research by Jeon and Kim (2025). Overall, our results show that continuous stochastic volatility dominates near-term pricing dynamics, while stochastic interest rates materially influence valuations beyond one year.
Large language models (LLMs) have shown strong performance across diverse financial tasks, yet portfolio management (PM), a critical financial decision-making task, remains poorly benchmarked. Existing benchmarks exhibit two main gaps: they ignore cross-asset correlation structures, thereby failing to distinguish genuinely diversified portfolios from concentrated ones, and fail to evaluate the complete PM decision pipeline in real-world scenarios. We introduce PortBench, a benchmark spanning six heterogeneous asset classes over ten years. PortBench consists of two complementary layers: a static QA dataset of 6,269 correlation-based questions across seven task templates, and a dynamic five-stage allocation pipeline that mirrors the full PM decision cycle. To evaluate these layers, we introduce two dedicated metrics: a dual-layer correlation score that measures whether proposed portfolios exploit inter-class hedging and avoid intra-class concentration, and CEPS, a metric that quantifies how reasoning errors compound across pipeline stages. We further assess strategy robustness and investor alignment under three historical stress regimes and risk profiles. Evaluating ten frontier LLMs, we find that despite strong performance on static financial QA, 90\% of model-profile combinations fail to outperform a basic equal-weight allocation, and models that satisfy every procedural constraint still suffer catastrophic drawdowns under stress. Our source code is available at \href{https://github.com/AgenticFinLab/portbench}{this https URL}.
This study develops a regime-aware portfolio allocation framework that integrates Markov switching models with Reinforcement Learning (RL) to dynamically allocate across equities (SPY), long-term Treasuries (TLT), and gold (GLD). Using daily ETF data from 2004-2025, we first characterize market behavior through a discrete Markov chain and then estimate a three-state Gaussian Hidden Markov Model (HMM) selected by the Bayesian Information Criterion (BIC). The estimated regimes (low-volatility, transitional, and high-volatility) exhibit strong persistence and state-dependent return dynamics consistent with recent findings on nonlinear market states (Ardia et al., 2024; Gupta & Pierdzioch, 2023). State-conditional analysis shows that SPY dominates in stable regimes, while TLT and GLD provide protection during stressed periods, motivating regime-conditioned allocation rules.
We evaluate rule-based rotation and RL-driven strategies using a 30% out-of-sample test window with a one-day execution lag to avoid look-ahead bias. Both HMM-based allocations outperform a passive SPY benchmark, while the RL policy achieves the highest risk-adjusted performance, delivering the strongest Sharpe ratio and materially lower drawdowns, yet remains fully interpretable through discrete regime-dependent actions. Sensitivity analysis confirms the robustness of the three-state specification relative to two-state alternatives. Overall, the results demonstrate that RL can systematically enhance HMM-based regime detection, providing a transparent, adaptive, and empirically grounded framework for tactical asset allocation. The combined HMM-RL system provides a transparent, rules-based approach to tactical allocation that improves risk-adjusted performance relative to standard benchmark strategies.
The quadratic framework decomposes concentration into three layers while the same residual operator sets dynamic transmission capacities.
Ownership concentration is not a scalar. For a normalized investor-stock matrix $A$, it has three irreducible layers: concentration across investors, concentration across stocks, and dependence in the joint assignment of investors to stocks. This paper develops a unified quadratic framework for those layers and shows that the same residual operator that measures static overlap also governs linearized market transmission. Raw micro concentration $M(A) = \sum_{i,j} A_{ij}^2$ admits exact row and column decompositions, support bounds, and fixed-marginal extremal characterizations on the transportation polytope. Benchmark-adjusted dependence $\mathcal{X}(A) = \sum_{i,j} (A_{ij} - p_i s_j)^2 / (p_i s_j)$ admits two exact decompositions: it is a size-weighted average of investor-level deviations from the market portfolio and, symmetrically, of stock-level deviations from the investor base. The paper also proves a multiscale aggregation law: under any partition of investors, total dependence splits exactly into between-group dependence and within-group heterogeneity. Spectrally, $\mathcal{X}(A)$ equals the sum of squared nontrivial singular values of the whitened matrix $D_p^{-1/2} A D_s^{-1/2}$. The residual operator $L$ then yields two dynamic consequences: idiosyncratic fire-sale vulnerability is bounded by the dominant overlap mode $\rho(A)$, while aggregate benchmark-relative alpha variance has worst-case capacity $\rho(A)^2$ and isotropic average-case capacity $\mathcal{X}(A)$. The fixed-marginal geometry also motivates a feasible-range sparsity score that benchmarks observed micro concentration against the sharp minimum and maximum implied by the marginals. The resulting framework separates scale concentration, feasible sparsity, overlap, and linear transmission in a way that is mathematically transparent and empirically usable for work on crowding, fragility, and systemic risk.
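The two static measures can be computed directly from a normalized matrix. The sketch below evaluates $M(A)$ and $\mathcal{X}(A)$ and checks $\mathcal{X}(A)$ against the squared Frobenius norm of the whitened matrix minus one, where the subtracted one is the squared trivial singular value. The example matrix and function names are illustrative, and all marginals are assumed strictly positive.

```python
def marginals(a):
    """Row sums p (investor sizes) and column sums s (stock sizes)."""
    p = [sum(row) for row in a]
    s = [sum(col) for col in zip(*a)]
    return p, s


def micro_concentration(a):
    """M(A): sum of squared entries."""
    return sum(x * x for row in a for x in row)


def dependence(a):
    """X(A): chi-square style deviation of A from the product p_i * s_j."""
    p, s = marginals(a)
    return sum((a[i][j] - p[i] * s[j]) ** 2 / (p[i] * s[j])
               for i in range(len(p)) for j in range(len(s)))


def whitened_frobenius_sq(a):
    """Squared Frobenius norm of D_p^{-1/2} A D_s^{-1/2}."""
    p, s = marginals(a)
    return sum(a[i][j] ** 2 / (p[i] * s[j])
               for i in range(len(p)) for j in range(len(s)))


if __name__ == "__main__":
    a = [[0.3, 0.1],
         [0.1, 0.5]]  # hypothetical holdings matrix, entries sum to 1
    print("M(A) =", micro_concentration(a))
    print("X(A) =", dependence(a))
    # All squared singular values minus the trivial one (equal to 1).
    print("check =", whitened_frobenius_sq(a) - 1.0)
```

If every investor holds the market portfolio, so that $A_{ij} = p_i s_j$, the dependence measure is zero even though $M(A)$ can still be large, which is the sense in which the layers are distinct.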
Multi-agent LLM decision systems for portfolio management still lack a principled way to assign credit across specialist agents, remain vulnerable to cold-start dominance under regime shifts, and offer limited transparency into how final allocations are formed. We propose Market Regime Council (MRC), a cooperative multi-agent decision system that computes exact Shapley credits across all single, pairwise, and Grand-coalition outputs for online agent weighting. Instantiated with N=3 specialist agents, at each trading period, MRC recomputes coalition-based Shapley weights from exponentially weighted performance histories, uses a Bayesian adaptive mixture to stabilize early periods, applies regime-dependent multipliers to adjust agent authority, and records each rebalance through a five-layer causal trace. Over 1,037 trading days across 13 crypto assets and five seeds, MRC achieves a Sharpe ratio of 1.51 and a cumulative return of 440.1%, ranking first on CR, SR, and IR among active baselines and attaining the lowest MDD among active methods. Ablation results show that the gains come from Shapley-weighted integration across coalition outputs rather than from any single stage in isolation. Code and demo data are included in the supplementary material.
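With three agents there are only eight coalitions, so exact Shapley credits are cheap to compute. The sketch below averages marginal contributions over all orderings; the agent labels and coalition values are made up for illustration, and the way MRC converts credits into allocation weights is not shown because the abstract does not specify it.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values from a table of coalition values.

    `value` maps each frozenset of players (including the empty set) to a
    number, for example a performance score of that coalition's output.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            grown = coalition | {player}
            # Marginal contribution of `player` when joining `coalition`.
            totals[player] += value[grown] - value[coalition]
            coalition = grown
    return {p: total / len(orderings) for p, total in totals.items()}


if __name__ == "__main__":
    players = ["a", "b", "c"]  # hypothetical specialist agents
    value = {
        frozenset(): 0.0,
        frozenset("a"): 1.0,
        frozenset("b"): 1.0,
        frozenset("c"): 0.0,
        frozenset("ab"): 3.0,
        frozenset("ac"): 1.0,
        frozenset("bc"): 1.0,
        frozenset("abc"): 3.0,
    }
    print(shapley_values(players, value))
```

In this toy table agent c never changes a coalition's value and so receives zero credit, while a and b split the grand-coalition value equally.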
We explore the application of LLM-driven algorithm optimization to several common tasks in quantitative finance. MadEvolve, a general-purpose algorithm optimization framework inspired by DeepMind's AlphaEvolve, was recently developed to optimize algorithms in computational cosmology. Here we demonstrate the utility of MadEvolve for optimizing algorithmic trading strategies and alpha generation, using Bitcoin trading as an example. On our simulation and backtesting setup, we achieve significant improvements on all tasks we considered, such as evolving feature sets for signal generation, optimizing separate components of the trading strategy, and jointly evolving the feature pipeline together with the execution strategy. Additionally, we compare our method to other agentic search approaches, specifically Claude Code, and carefully evaluate p-hacking probabilities on our simulation setup. Our findings strongly support the utility of AI-driven agentic and evolutionary algorithms for algorithmic trading and quantitative finance.
Institutional crossing platforms face a hidden-information problem: investors value trades as portfolios, but liquidity discovery is typically organized around individual securities. We model portfolio crossing as limited-communication preference elicitation over signed portfolio trades. The platform first uses price-directed demand queries to search the portfolio space and then verifies selected packages through value queries; an incumbent verification query records the demand-discovered allocation before further exploration. Final allocations are chosen from elicited reports, so the learning model guides queries but does not determine welfare. The analysis shows why search and verification are complementary. Demand queries locate high-value regions of a nonseparable portfolio space, but they provide only conservative welfare evidence unless selected packages are verified. Value queries provide exact welfare comparisons, but they are ineffective when applied to poorly targeted packages. Market-calibrated experiments using equity panels from the United States, Korea, Japan, and Germany show that demand-only and value-only designs recover only about half of full-information welfare under a limited query budget, whereas the hybrid procedure recovers 88\% and approaches 95\% as communication expands. We then compare exact security-level packages with factor-completed basket packages within the same allocation rule. Security-level packages are the unadjusted-efficiency mode when exact-securities disclosure is inexpensive. Factor-completed baskets become preferable when pretrade message informativeness is costly. The results characterize portfolio crossing as a selective verification problem and identify disclosure-sensitive package representation as a core design choice for hidden liquidity platforms.
This paper studies conditional allocation between a growth/technology ETF basket, denoted by $G$, and a defensive income/value-oriented ETF basket, denoted by $D$. The objective is not to discover a new standalone alpha factor, but to examine whether known style exposures can be dynamically allocated using macro-market timing signals. Fama-French five-factor plus momentum attribution shows that the relative portfolio $G-D$ is a recognizable style portfolio: its market beta is 0.273, its HML beta is -0.552, its momentum beta is 0.117, and its annualized alpha is 1.95\% with a Newey-West t-statistic of only 0.81. The empirical object is therefore interpreted as a growth-versus-defensive style allocation problem rather than a new return anomaly.
The allocation framework replaces discrete regime labels and if-then trading rules with a continuous smooth score. The score combines rate relief, SPY drawdown depth, high-VIX stress relief, and a growth-crowding penalty. Interaction terms are smoothed with softplus functions, the total score is mapped to G/D weights through a hyperbolic tangent function, and realized weights are smoothed with EWMA. In the main aligned comparison window from June 28, 2017 to May 15, 2026, with 10bp transaction costs, the selected smooth-score policy uses a 50\% maximum active tilt and obtains a 19.24\% CAGR, a Sharpe ratio of 1.01, and a maximum drawdown of -31.63\%. It improves over 50/50 G/D, matched TNX-only, matched core-only, SPY, and volatility-matched 100\% G benchmarks. It does not, however, exceed 100\% G or the best high-G static portfolios in raw CAGR. Walk-forward and post-2022 validations provide additional evidence of drawdown reduction and risk-adjusted allocation value. Overall, the evidence supports continuous, interpretable style timing, while also showing that high static growth exposure remains a strong benchmark.
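The pipeline described above (softplus-smoothed interactions, a tanh mapping to weights, then EWMA smoothing) can be sketched as follows. The score formula, the input scaling, the reading of the 50% maximum tilt as a deviation of up to 0.5 around a 50/50 base, and all function names are assumptions made for illustration; the paper's actual specification is not given in the abstract.

```python
import math


def softplus(x):
    """Numerically stable softplus, log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def toy_score(rate_relief, drawdown_depth, vix_relief, crowding):
    """Hypothetical score: relief terms add, crowding subtracts."""
    interaction = softplus(drawdown_depth) * vix_relief
    return rate_relief + interaction - softplus(crowding)


def growth_weight(score, max_tilt=0.5):
    """Map a score to the G weight; D receives 1 - weight.

    Assumption: a 50/50 base allocation tilted by at most max_tilt.
    """
    return 0.5 + max_tilt * math.tanh(score)


def ewma(values, alpha):
    """Exponentially weighted moving average, seeded with the first value."""
    smoothed = []
    previous = None
    for v in values:
        previous = v if previous is None else alpha * v + (1.0 - alpha) * previous
        smoothed.append(previous)
    return smoothed


if __name__ == "__main__":
    # Hypothetical daily signal inputs.
    inputs = [(0.2, 0.1, 0.0, 0.5), (0.4, 0.8, 0.6, 0.2), (-0.3, 0.0, -0.2, 1.0)]
    raw = [growth_weight(toy_score(*row)) for row in inputs]
    for w in ewma(raw, alpha=0.3):
        print(f"G weight={w:.3f}  D weight={1.0 - w:.3f}")
```

Every step is smooth in its inputs, so small changes in the signals produce small changes in the weights, in contrast to discrete regime labels.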
Graph neural network tests on S&P 500 stocks find that lowest MSE, highest ranking accuracy, and highest Sharpe ratio come from three unlike
This paper tests whether graph neural networks improve realized volatility forecasts and whether those forecasts improve portfolio performance. Using weekly realized volatility for 465 S&P 500 equities from 2015-2025, Heterogeneous Autoregressive and Long Short-Term Memory baselines are compared against GraphSAGE models built on rolling correlation, sector, and Granger-causal graphs, with and without macro regime features. The empirical finding is that the model with the lowest forecast MSE, the model with the highest cross-sectional ranking accuracy, and the model with the highest portfolio Sharpe ratio are three different models. Forecast accuracy, ranking quality, and portfolio performance are related but not interchangeable objectives. Graph volatility models add value only when the portfolio rule can exploit the cross-sectional structure they encode.
Cardinality-constrained portfolio selection is routinely cast as a quadratic unconstrained binary optimization (QUBO) and submitted to a quantum processing unit (QPU) for direct annealing. We show that this standard penalty encoding is the binding constraint for direct-QPU execution on current D-Wave Pegasus and Zephyr hardware. Expanding the exact cardinality penalty contributes a dense rank-one term that makes the logical interaction graph complete regardless of the covariance, producing chain-break fractions from 83% at small universes up to 92% at the full forty-nine-industry Fama--French universe, and zero feasible raw samples at every tested scale. Topology-aware sparsification reduces chain breaks to near zero, but any sparsifier that removes off-diagonal entries also dilutes the cardinality constraint; an ablation reveals that this sparsify-and-project pipeline is dominated by the classical projector, not the QPU. We propose removing the penalty entirely: sample an objective-only QUBO built from expected returns and the risk-scaled covariance on hardware, and enforce cardinality classically through a deterministic feasibility projector. Across 4,468 saved embedding records on live Pegasus and Zephyr hardware, spanning equities up to forty-nine assets and football-betting instances up to forty-eight, this penalty-free pipeline reduces mean chain-break fractions from 71%--92% down to at most 0.04%, and post-processed regret is at most 0.03% relative to greedy classical references at every tested scale. We do not claim quantum advantage; the penalty encoding, not the sparse hardware topology, is the limiting factor for direct-QPU portfolio optimization at currently accessible scales.
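The density claim follows from expanding the penalty $\lambda(\sum_i x_i - K)^2$ with binary $x_i$: each variable gets a linear coefficient $\lambda(1-2K)$ and every pair $(i,j)$ gets a coupler $2\lambda$, whatever the covariance looks like. The sketch below builds these coefficients and verifies them by brute force; the dictionary layout and function names are illustrative and do not follow any particular SDK.

```python
from itertools import combinations, product


def cardinality_penalty_qubo(n, k, strength):
    """Coefficients of strength * (sum(x) - k)^2 for binary x of length n."""
    # x_i^2 = x_i for binary variables, so squares fold into linear terms.
    linear = {i: strength * (1 - 2 * k) for i in range(n)}
    quadratic = {(i, j): 2.0 * strength for i, j in combinations(range(n), 2)}
    offset = strength * k * k
    return linear, quadratic, offset


def qubo_energy(x, linear, quadratic, offset):
    """Evaluate offset + sum_i a_i x_i + sum_{i<j} b_ij x_i x_j."""
    energy = offset
    energy += sum(coef * x[i] for i, coef in linear.items())
    energy += sum(coef * x[i] * x[j] for (i, j), coef in quadratic.items())
    return energy


if __name__ == "__main__":
    n, k, strength = 6, 2, 3.0
    linear, quadratic, offset = cardinality_penalty_qubo(n, k, strength)
    print("couplers:", len(quadratic), "of", n * (n - 1) // 2, "possible")
    worst = max(
        abs(qubo_energy(x, linear, quadratic, offset) - strength * (sum(x) - k) ** 2)
        for x in product((0, 1), repeat=n)
    )
    print("max deviation from direct penalty:", worst)
```

Every pair of variables is coupled, so the logical graph is complete for any $n$, which is why embedding it on a sparse hardware topology requires long chains.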
We audit the operational decomposition of D-Wave's hybrid quantum-classical portfolio-optimization service on cardinality-constrained mean-variance-turnover instances spanning N=10 to 640, with the constraint-native LeapHybridCQM interface, the penalty-encoded LeapHybridBQM interface, and Gurobi MIQP and simulated-annealing classical anchors. We report all three SDK timing fields (t_run, t_charge, t_QPU) and define a candidate four-metric audit protocol for hybrid quantum-classical solvers. Three findings. First, the LeapHybridCQM service matches Gurobi's proven optimum on all 54 head-to-head instances at N <= 120, but the mean QPU access time is 0.034 seconds out of the 5-second nominal wall-clock budget -- 0.68% of the nominal budget, approximately 0.72% of measured run time -- and the remaining ~99% is the service's classical decomposition and feasibility-aware reassembly. Second, in a CPU-only matched-wall-clock counterfactual, TabuSampler on the penalty-encoded BQM reaches final exact-K objectives within mean absolute delta 0.001 of hybrid CQM on 24 tested instances; this does not ablate the LeapHybridCQM pipeline internals, but it shows that these objective levels are reproducible by a classical heuristic at the same wall-clock budget. Third, the cardinality penalty contributes a dense rank-one term that fully connects the encoded logical graph independent of the input covariance density, an effect we prove as a structural theorem; the resulting density-axis collapse explains the BQM degradation observed in the empirical comparison. Out-of-sample on Fama-French 49 industry portfolios, the QPU-selected portfolios deliver a mean Sharpe ratio of 1.94 versus 2.22 for the 1/N baseline. The practical implication is that reported D-Wave hybrid wins on this problem class are constraint-native classical pipelines, not quantum-sampling wins.
This study develops and evaluates a deep reinforcement learning framework for dynamic portfolio allocation across global equity markets. The Soft Actor-Critic algorithm is used to learn continuous portfolio weights within a Markov Decision Process, incorporating transaction costs, turnover penalties, and diversification constraints into the reward function. Five model configurations are compared, varying in reward formulation, policy structure (flat versus hierarchical Dirichlet), portfolio constraints, and temporal encoder (LSTM versus Transformer), and evaluated via walk-forward optimization across sixteen out-of-sample folds spanning 2003-2026 on the Nasdaq-100, Nikkei 225, and Euro Stoxx 50. Results show that RL strategies achieve competitive risk-adjusted performance primarily in the Euro Stoxx 50, where statistically significant abnormal returns are observed, but the central hypothesis is only partially confirmed: no strategy achieves statistically significant excess returns relative to Buy and Hold under HAC-robust inference across all markets. Regime analysis reveals that RL adds the most value during periods of elevated uncertainty, while ensemble aggregation across markets improves risk-adjusted performance and confirms the benefits of geographic diversification.
Portfolio optimization in real-world financial markets is notoriously difficult due to non-stationarity, noisy data, and high transaction costs. Standard predict-then-optimize methods first forecast returns and then solve for weights, compounding prediction errors and often failing under regime shifts. We propose an end-to-end framework that directly optimizes differentiable surrogates of key financial metrics - Sharpe ratio, Omega ratio, Conditional Value-at-Risk (CVaR), and Risk Parity - allowing neural networks to learn portfolio weights via backpropagation. Our expanding-window walk-forward procedure, applied to 50 S&P 500 stocks from 2007 to 2023, incorporates realistic bid-ask spread costs and rebalances quarterly. On the challenging out-of-sample test period (2022-2023), the best model - an AttentionLSTM with the Omega-CVaR-RiskParity loss - achieves an annualized Sharpe of 0.29 and a total compounded return of +7.86%, while the S&P 500 delivers -4.52% total return and an annualized Sharpe of -0.02. This outperforms the S&P 500 by 12.38 percentage points (a relative improvement of over 270%), while keeping tail risk (CVaR) nearly unchanged. The framework consistently outperforms the equal-weight portfolio, S&P 500, and traditional methods (MVP, HRP, NCO), demonstrating that embedding financial objectives directly into model training yields robust, economically meaningful outperformance even in adverse market conditions.
This paper investigates risk measures derived from the expected maximum deficit in a continuous-time framework and develops optimal reserve allocation strategies across multiple lines of business. We formalize the expected maximum deficit and study its associated distortion risk measures. Furthermore, we introduce implicitly bounded risk measures based on the minimal capital required to meet prescribed fixed and proportional risk tolerances, and propose approaches for optimal capital allocation using line-specific distorted expected deficits. Theoretical results established include static coherence and convexity properties, dynamic conditional extensions detailing supermartingale time consistency over a fixed horizon and the evolution of capital requirements across rolling horizons, and exact analytical optimizations of the aggregate minimum reserve.
Loss differentials treated as returns reveal experts avoid big failures while some models win on specific targets.
Average forecast accuracy is not the same as forecast reliability. I treat forecast loss differentials relative to a benchmark as a return series. I then evaluate these returns using risk-adjusted performance measures from finance, including the Sharpe ratio, Sortino ratio, Omega ratio, and drawdown-based metrics. I also introduce the Edge Ratio capturing a model's propensity to deliver uniquely informative predictions relative to the forecasting frontier. I apply this framework to U.S. macroeconomic forecasting, comparing econometric benchmarks, machine learning models, a foundation model (TabPFN), and the Survey of Professional Forecasters. While it is often feasible to beat professional forecasters in terms of average accuracy, it is much harder to beat them on a risk-adjusted basis. They rarely exhibit catastrophic failures and often achieve high Edge Ratios, plausibly reflecting the value of contextual judgment. Nonetheless, selected machine learning methods deliver attractive risk profiles for specific targets. The framework naturally extends to meta-analyses across targets, horizons, and samples, illustrated with a density forecast evaluation and the M4 competition.
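Treating loss differentials as a return series can be sketched as follows. This is an illustrative reading of the abstract, not the paper's implementation: the loss values are made up, the sign convention (benchmark loss minus model loss) is an assumption, and the Edge Ratio is omitted because the abstract does not define it.

```python
import numpy as np

def loss_differential(benchmark_loss, model_loss):
    """Positive values mean the model beat the benchmark in that period."""
    return np.asarray(benchmark_loss, float) - np.asarray(model_loss, float)

def sharpe(returns):
    return returns.mean() / returns.std(ddof=1)

def sortino(returns, target=0.0):
    # Downside deviation uses only shortfalls below the target.
    downside = np.minimum(returns - target, 0.0)
    return (returns.mean() - target) / np.sqrt(np.mean(downside ** 2))

def omega(returns, threshold=0.0):
    # Ratio of summed gains to summed losses around the threshold.
    gains = np.maximum(returns - threshold, 0.0).sum()
    losses = np.maximum(threshold - returns, 0.0).sum()
    return gains / losses

def max_drawdown(returns):
    # Largest peak-to-trough drop of the cumulative differential, starting from 0.
    cumulative = np.cumsum(returns)
    peak = np.maximum.accumulate(np.concatenate(([0.0], cumulative)))[1:]
    return float(np.max(peak - cumulative))

# Hypothetical per-period losses for a benchmark and a candidate model.
d = loss_differential([1.0, 1.2, 0.8, 1.5, 1.0], [0.8, 1.0, 1.4, 1.0, 0.9])
print(sharpe(d), sortino(d), omega(d), max_drawdown(d))
```

A model with a good average differential can still show a large drawdown, here driven by the single period in which it loses badly to the benchmark.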
ESG-aware portfolio optimization is increasingly important for sustainable capital allocation, yet most learning-based methods still operationalize ESG by appending static scores to the policy observation or reward. This creates a mismatch for sequential control: ESG scores are noisy, provider-dependent, low-frequency, and temporally misaligned with sequential portfolio decisions, while financial evidence suggests that ESG is better treated as a portfolio preference, risk-exposure, or hedge dimension than as a robust alpha factor. We propose to impose ESG constraints without modifying the financial policy's observation or reward, using a Multimodal Action-Conditioned Constraint Field (MACF) that learns mechanism-specific ESG costs from point-in-time multimodal evidence and contemplated portfolio transitions. We then introduce MACF-X, a family of optimizer-specific adapters that converts MACF costs and uncertainties into native constrained-optimization interfaces through a shared slack- and uncertainty-aware pressure layer. Across multiple constraint-integration interfaces, MACF-X reduces tail ESG budget pressure while maintaining competitive financial performance. Ablations show that this improvement depends on dynamic evidence inputs and three-head decomposition, while static ESG-score proxies are nearly indistinguishable from score-shuffled noise baselines.
A path-dependent approach shapes portfolio exposure so recovery after losses requires less upside participation than symmetric de-risking.
Volatility is the language in which finance often describes risk, but it is not the language in which institutions experience risk. Allocators live through drawdowns, liquidity needs, spending rules, rebalance decisions, board oversight, and the interval between a prior high-water mark and full recovery. This paper develops a path-dependent framework for asymmetric volatility management. The arithmetic of recovery is nonlinear: after a drawdown of depth $D$, the required gain is $R=\frac{1}{1-D}-1$. Lower volatility can improve geometric compounding through the familiar small-return approximation $g \approx \mu-\frac{1}{2}\sigma^2$, but symmetric de-risking can also impair recovery if it sacrifices too much upside participation. The relevant design problem is therefore not volatility reduction in isolation; it is conditional exposure shaping. Skew engineering is defined here as the portfolio construction discipline of reducing harmful downside participation more than productive upside participation, controlling submergence, and preserving enough recovery participation to sustain compounding under adverse regimes. The resulting Recovery-Efficiency Protocol links drawdown depth, time underwater, recovery burden reduction, and rebound participation into an allocator-facing reporting discipline. Machine learning and AI methods are framed as tools for conditional estimation, regime mapping, robustness testing, and model-risk governance, not as market prediction.
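The two formulas quoted in the abstract are easy to evaluate directly. The sketch below only restates $R=\frac{1}{1-D}-1$ and $g \approx \mu-\frac{1}{2}\sigma^2$; the function names and the sample inputs are illustrative assumptions.

```python
def required_recovery_gain(drawdown):
    """Gain needed to return to the prior high after a drawdown of depth D."""
    if not 0.0 <= drawdown < 1.0:
        raise ValueError("drawdown must lie in [0, 1)")
    return 1.0 / (1.0 - drawdown) - 1.0

def approx_geometric_growth(mu, sigma):
    """Small-return approximation g = mu - sigma**2 / 2."""
    return mu - 0.5 * sigma ** 2

for depth in (0.10, 0.20, 0.50):
    print(f"D={depth:.0%} -> R={required_recovery_gain(depth):.1%}")

print(approx_geometric_growth(mu=0.08, sigma=0.20))
```

The nonlinearity is visible in the first loop: the required gain grows faster than the drawdown, and a 50% loss needs a 100% gain to recover.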
We study the single-period portfolio selection problem under Constant Relative Risk-Aversion (CRRA) utility through the information-theoretic lens. Assuming only that the market payoff vector has finite support, we show that the Certainty-Equivalent (CE) growth rate under CRRA utility can be decomposed into a portfolio-induced R\'enyi divergence term, a R\'enyi entropy term of the risk-tilted market law, and a log-partition term. In this setting, the R\'enyi order has a clear operational meaning: it exactly coincides with the investor's coefficient of relative risk aversion. We further show that CRRA portfolio selection is equivalent to a R\'enyi information-projection problem. Using a variational representation of R\'enyi divergence, we obtain a Blahut-Arimoto-style alternating optimization with a closed-form auxiliary update and a KL-type portfolio step. In the low risk-aversion regime, this method empirically requires fewer iterations than both direct CRRA utility optimization and Cover's method.
Large-scale portfolio choice is highly sensitive to estimation error, making the preliminary asset selection essential in empirical implementation. Existing selection rules typically rely on scalar returns or low dimensional high frequency summaries, and thus discard intraday risk dynamics that may be relevant for risk adjusted allocation. We propose Metric Dependence Screening (MDS), an asset selection procedure that incorporates high frequency information as object valued data. Each asset day observation is represented as a point-curve object combining daily return with an intraday risk state curve, equipped with a weighted product metric that preserves both reward information and within day risk dynamics. MDS ranks assets by a Fr\'echet variation based dependence score, measuring how much a risk adjusted target explains the metric dispersion of the asset representations. This yields a simple two stage portfolio procedure: MDS first reduces the investable universe, and standard mean-variance or minimum variance allocation is then applied. We develop a target slicing estimator and establish concentration, sure selection, and rank consistency guarantees under $\alpha$-mixing time series dependence and ultrahigh dimensionality. Simulations show that MDS performs well across both Euclidean and non-Euclidean settings. Using high frequency data for $2938$ Chinese A-share stocks from July 2023 to December 2025, we demonstrate that MDS improves out of sample portfolio performance over return based and scalar dependence based benchmarks, highlighting the value of preserving intraday risk dynamics.
Decision-focused learning (DFL) is attractive for portfolio optimization because it trains predictors according to downstream decision quality rather than prediction accuracy alone. However, DFL based on the SPO (Smart "Predict, then Optimize") surrogate may produce inflated return signals and unstable portfolio reallocations. This study provides a KKT-based interpretation showing that portfolio decisions can be viewed as ranking over risk- and transaction-cost-adjusted marginal scores. Empirically, we examine prediction inflation and excessive turnover in SPO-trained portfolios, and evaluate clipping, min-max rescaling, and partial portfolio adjustment as practical stabilization mechanisms. The results suggest that realistic output constraints and portfolio-level turnover control improve the implementability of SPO-based portfolio strategies.
Modern stochastic optimization pipelines increasingly rely on learned generative models to represent uncertainty, while downstream decisions are evaluated almost entirely through Monte Carlo scenarios. This shifts the operational object of uncertainty from an explicit probability law to the sampler induced by the learned generator. Reliability therefore depends on two errors: sampler misspecification and finite-simulation error. We propose Sampler-Robust Optimization (SRO), which optimizes decisions against the worst-case sampler induced by perturbing the learned generator. This sampler-first formulation aligns with simulation-based decision pipelines and admits a sharpness-aware interpretation: it favors decisions whose performance is stable under generator perturbations, rather than merely under the nominal sampler. Under a coverage assumption, we show that the empirical worst-case objective provides a high-probability upper certificate for the true population objective, with finite-simulation error partially absorbed by the robustification used to guard against sampler misspecification. The framework accommodates generative models with or without explicit densities and admits efficient minimax procedures. Portfolio-optimization experiments show that SRO produces more stable decisions and improves out-of-sample performance under distribution shift.
Counterintuitively, the S&P 500 Index rose between January 1, 2022, and December 29, 2023, while exchange-traded funds (ETFs) seeking to deliver 2x and 3x daily returns of the index delivered substantially negative returns. Roughly two-thirds of the difference between the returns of the index and the levered ETFs can be attributed to compounding and volatility. The remaining difference is explained by the covariance between the ETFs' deviations from constant leverage and the index's return.
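The compounding-and-volatility channel can be illustrated with a stylised path. This sketch assumes constant daily leverage with no fees or financing costs, so it does not capture the covariance term the abstract attributes to deviations from constant leverage; the return path is invented for illustration.

```python
import numpy as np

def compound(daily_returns, leverage=1.0):
    """Total return of a fund that delivers leverage * r each day."""
    return float(np.prod(1.0 + leverage * np.asarray(daily_returns)) - 1.0)

# Ten stylised days alternating +10% and -9%: volatile, with a slight upward drift.
daily = np.tile([0.10, -0.09], 5)

for lev in (1, 2, 3):
    print(lev, round(compound(daily, lev), 4))
```

On this path the unlevered index ends slightly up while the 2x and 3x daily-rebalanced versions end down, and the 3x version loses more than the 2x one.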
Simulations show predictive control gains nothing from random shifts but exploits known patterns in marketing budgets.
We study finite-horizon budget allocation as a closed-loop economic control problem and evaluate receding-horizon Model Predictive Control (MPC) relative to reactive budgeting policies. Budgets are allocated periodically under execution noise and operational constraints, while return efficiency may evolve over time. Using a controlled simulation framework motivated by digital marketing, we compare reactive pacing to MPC across environments with increasing degrees of non-stationarity. Our results show that non-stationarity alone does not justify predictive control. When return dynamics are stationary or evolve through unpredictable stochastic drift, MPC offers no systematic advantage over reactive baselines. By contrast, when return efficiency exhibits predictable structure over the planning horizon that an underlying model captures, MPC consistently outperforms reactive budgeting by exploiting intertemporal trade-offs.
LLM agents are promising tools for empirical discovery, but their flexibility can also turn discovery into uncontrolled search. We study how to use agents under a reproducible protocol through cryptocurrency factor discovery. Our framework casts the task as sequential hypothesis search: an agent reads an append-only experiment trace, proposes falsifiable factor hypotheses, and maps them to executable recipes, while a deterministic engine enforces fixed data splits, selection gates, transaction costs, and portfolio tests. Candidate actions are restricted to a point-in-time factor DSL, making both successful and failed hypotheses auditable. A ridge-combined portfolio trained only on 2020--2022 data achieves a 44.55% annualized return and Sharpe ratio of 1.55 in the 2024--2026 pure out-of-sample period after a 5 basis point one-way trading cost.
Organizations routinely make strategic budget allocations under operational constraints, but often lack a principled way to assess whether realized allocations were close to the best feasible choices in hindsight. We present a retrospective auditing framework based on hindsight regret, defined as the opportunity cost of the realized allocation relative to a constraint-faithful benchmark under the same budget and stability guardrails. The framework estimates regime-specific spend--response functions from historical logs, computes feasible hindsight allocations via constrained optimization, and propagates uncertainty through Monte Carlo evaluation to produce regret distributions, expected lift, and probability-of-improvement summaries. This separates allocation inefficiency from uncertainty in the estimated response surfaces. Experiments on real marketing allocation logs show that the framework yields interpretable post-hoc diagnostics and reveals a practical trade-off between allocation flexibility and detectability: moderate feasible reallocations often capture most measurable gain, while larger shifts move into weak-support regions with higher uncertainty. The result is a practical method for auditing historical budget decisions when online experimentation is costly or infeasible.
Unrestricted mean-variance-skewness-kurtosis portfolio optimization can capture asymmetry and tail risk, but sample-moment formulations become computationally impractical when the asset universe is large: they produce dense nonconvex quartic objectives with prohibitive coskewness and cokurtosis tensors and anisotropic, ill-conditioned level sets. We develop a structure-exploiting algorithm based on Yau's affine-normal descent that follows affine-normal directions of the current level set while working directly with the return matrix. The method avoids explicit higher-order tensors and exploits the quartic structure for exact sample oracles, derivative evaluation, and exact line search. We also provide theory for the reduced simplex formulation, including regularity and convexity conditions that separate data-map geometry from investor preference coefficients. Computational results show a clear implementation split: a direct configuration is effective on the standard small benchmark, whereas a preconditioned conjugate-gradient configuration with stall recovery becomes the preferred large-scale implementation by the upper end of the hundreds and remains competitive as the asset universe moves into the thousands. On a 5-minute A-share panel with 5,440 stocks, the method makes direct full-universe comparisons with exact mean-variance portfolios feasible and shows on the baseline split that the incremental value of higher moments is strongest at moderate return targets.
Hierarchical Risk Parity (López de Prado) and the Schur-complement generalization of Cotton are among the most widely adopted regularised portfolio construction methods, yet both are signal-blind: they solve only the minimum-variance problem and cannot accommodate an arbitrary expected-return forecast. This paper introduces three methods that incorporate alpha signals into hierarchical and regularised portfolio construction.
HRP-$\mu$ is a hierarchical allocator that accepts an arbitrary signal $\mu$ and nests standard HRP when $\gamma = 0$ and $\mu=\mathbf{1}$. It preserves the tree-based structure of HRP while extending it beyond the minimum-variance setting. HRP-$\Sigma\mu$ strengthens this construction by replacing inverse-variance representatives with recursive local mean-variance optima, thereby using richer within-cluster covariance information at the same $O(N^2)$ asymptotic cost.
CRISP (Correlation-Regularised Iterative Shrinkage Portfolios) is an iterative solver for $P_\gamma w = \mu$ with $P_\gamma = (1-\gamma)\operatorname{diag}(\Sigma) + \gamma \Sigma$, so that $\gamma$ interpolates between a diagonal portfolio rule and full Markowitz. At convergence, CRISP is Markowitz applied to a variance-preserving shrunk covariance (diagonal variances unchanged, off-diagonal correlations shrunk), with $\gamma$ tuned for out-of-sample Sharpe rather than covariance-estimation loss.
In Monte Carlo experiments across multiple covariance regimes and estimation ratios, HRP-$\mu$ and HRP-$\Sigma\mu$ both outperform plain HRP, with HRP-$\Sigma\mu$ consistently improving on HRP-$\mu$. CRISP at intermediate $\gamma$ is the dominant method in both regimes, outperforming HRP, Cotton, Ledoit-Wolf shrinkage, direct Markowitz, and the signal-aware hierarchical methods.
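The linear system $P_\gamma w = \mu$ with $P_\gamma = (1-\gamma)\operatorname{diag}(\Sigma) + \gamma \Sigma$ can be sketched in a few lines. Note the substitution: the abstract describes an iterative solver, whereas this sketch uses a direct solve (`numpy.linalg.solve`) purely to show the fixed point. The covariance and signal values are hypothetical.

```python
import numpy as np

def crisp_operator(sigma, gamma):
    """P_gamma = (1 - gamma) * diag(Sigma) + gamma * Sigma."""
    return (1.0 - gamma) * np.diag(np.diag(sigma)) + gamma * sigma

def solve_weights(sigma, mu, gamma):
    """Unnormalised weights w solving P_gamma w = mu (direct solve, not iterative)."""
    return np.linalg.solve(crisp_operator(sigma, gamma), mu)

# Hypothetical two-asset covariance and expected-return signal.
sigma = np.array([[0.040, 0.018],
                  [0.018, 0.090]])
mu = np.array([0.05, 0.07])

for gamma in (0.0, 0.5, 1.0):
    print(gamma, solve_weights(sigma, mu, gamma))
```

At $\gamma = 0$ the weights reduce to $\mu_i / \Sigma_{ii}$, at $\gamma = 1$ they are the unconstrained Markowitz solution $\Sigma^{-1}\mu$, and for any $\gamma$ the diagonal of $P_\gamma$ equals the diagonal of $\Sigma$, which is the variance-preserving property.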
We propose a Gaussian-copula-based framework that learns deal-level dependence directly from observed joint success frequencies across founder, geography, and market attributes. Holding marginal deal success probabilities fixed, deal-level correlation preserves expected portfolio outcomes but shifts the portfolio distribution toward heavier right tails and higher kurtosis. In portfolio simulations, correlation reduces the probability of modest success counts while sharply amplifying extreme upside outcomes, especially in structurally concentrated portfolios. Our findings suggest that extreme venture capital outcomes may partly reflect correlation-induced tail amplification rather than solely higher average deal quality, with potential implications for portfolio construction and risk management. We note that the observed dataset reflects selected deals with observable outcomes, which inflates apparent success rates relative to the true population base rate; however, the core finding that correlation reshapes the distributional shape while leaving the mean unchanged is structurally robust to the level of marginal success probabilities.
Electricity price forecasting supports decision-making in energy markets and asset operation. Probabilistic forecasts are increasingly adopted to explicitly quantify uncertainty, typically issued as quantile predictions or ensembles of the full predictive distribution. However, how improvements in statistical forecast quality translate into economic value remains unclear. Battery storage arbitrage in day-ahead markets is a popular application-based benchmark for this purpose. We analyze quantile-based trading strategies (QBTS) and identify two critical flaws: they do not incentivize honest probabilistic forecasting and they ignore the intertemporal dependence structure of electricity prices. We therefore frame battery optimization as a stochastic program based on fully probabilistic forecasts and examine decision quality measurement for risk-neutral and risk-averse settings under different uncertainty models. Our discussion touches both sides of the coin: How reliable is the economic evaluation of forecasting models through (simplified) application studies, and how do improvements in statistical forecast quality for stochastic programs relate to decision quality and economic performance? We provide theoretical justification and empirical evidence from a case study on the German electricity market. Our results highlight the pitfalls of ranking forecasting models through battery trading strategies. We conclude with implications for evaluation practice and directions for future research in application-based forecast assessment.
Text-based financial networks are increasingly used to study cross-stock return predictability. A common approach constructs links from similarities in firms' disclosure embeddings, but such networks often contain spurious edges because textual proximity does not necessarily imply economic connection. We propose a two-stage framework that first builds a sparse candidate graph from 10-K embeddings and then uses a large language model to classify and filter candidate edges according to their economic relations. The refined graph is used to aggregate pair-level mean-reversion signals into stock-level trading signals with relation-aware and distance-based weights. In a backtest on S&P 500 constituents from 2011 to 2019, LLM-based edge filtering improves the long-short Sharpe ratio from 0.742 to 0.820 and reduces maximum drawdown from $-$10.47% to $-$7.85%. These results suggest that LLM-based reasoning can improve the economic fidelity of text-derived financial networks and strengthen cross-stock predictability.
Institutional allocators often evaluate structured strategies on the basis of marketed backtests -- hypothetical track records constructed by applying a strategy's rules to historical data prior to any live trading, also referred to as pro-forma performance. It is unclear how much of that signal survives once the strategy is actually traded. Using 1,726 commercially distributed structured strategies from ten global institutions, this paper shows that raw pro-forma performance has only limited portability into the live period and weakens sharply once live outcomes are measured relative to peer and external benchmarks. The evidence indicates that marketed backtests predominantly reflect the common factor regime present before launch rather than strategy-specific skill. Strategies launched after unusually strong bucket-factor conditions experience materially worse subsequent deterioration. For allocators, the implication is practical: backtests should be judged relative to appropriate peer benchmarks, and the discount applied to them should increase when launch occurs after an extreme factor run.
We propose post-screening portfolio selection (PS$^2$), a two-step framework for high-dimensional mean--variance investing. First, assets are screened by Lasso-type regression of a constant on excess returns without an intercept. Second, portfolio weights are estimated on the selected set using standard low-dimensional methods. Because strong factors can destroy sparsity in real data, we further introduce PS$^2$ with factors (FPS$^2$), which defactors returns before screening and allows factor investing in the final step. We establish theoretical guarantees, and simulations and an empirical application show competitive performance, especially when sparse screening is appropriate or strong factors are explicitly accommodated.
We present the first portfolio-level validation of MarketSenseAI, a deployed multi-agent LLM equity system. All signals are generated live at each observation date, eliminating look-ahead bias. The system routes four specialist agents (News, Fundamentals, Dynamics, and Macro) through a synthesis agent that issues a monthly equity thesis and recommendation for each stock in its coverage universe, and we ask two questions: do its buy recommendations add value over both passive benchmarks and random selection, and what does the internal agent structure reveal about the source of the edge? On the S&P 500 cohort (19 months) the strong-buy equal-weight portfolio earns +2.18%/month against a passive equal-weight benchmark of +1.15% (approximating RSP), a +25.2% compound excess, and ranks at the 99.7th percentile of 10,000 Monte Carlo portfolios (p=0.003). The S&P 100 cohort (35 months) delivers a +30.5% compound excess over EQWL with consistent direction but formal significance not reached, limited by the small average selection of ~10 stocks per month. Non-negative least-squares projection of thesis embeddings onto agent embeddings reveals an adaptive-integration mechanism. Agent contributions rotate with market regime (Fundamentals leads on S&P 500, Macro on S&P 100, Dynamics acts as an episodic momentum signal) and this agent rotation moves in lockstep with both the sector composition of strong-buy selections and identifiable macro-calendar events, three independent views of the same underlying adaptation. The recommendation's cross-sectional Information Coefficient is statistically significant on S&P 500 (ICIR=+0.489, p=0.024). These results suggest that multi-agent LLM equity systems can identify sources of alpha beyond what classical factor models capture, and that the buy signal functions as an effective universe-filter that can sit upstream of any portfolio-construction process.
Sparsity or complexity? In modern high-dimensional asset pricing, these are often viewed as competing principles: richer feature spaces appear to favor complexity, while economic intuition has long favored parsimony. We show that this tension is misplaced. We distinguish capacity sparsity (the dimensionality of the candidate feature space) from factor sparsity (the parsimonious structure of priced risks) and argue that the two are complements: expanding capacity enables the discovery of factor sparsity. Revisiting the benchmark empirical design of Didisheim et al. (2025) and pushing it to higher complexity regimes, we show that nonlinear feature expansions combined with basis pursuit yield portfolios whose out-of-sample performance dominates ridgeless benchmarks beyond a critical complexity threshold. The evidence shows that the gains from complexity arise not from retaining more factors, but from enlarging the space from which a sparse structure of priced risks can be identified. The virtue of complexity in asset pricing operates through factor sparsity.
Topological Risk Parity keeps part of each signal at parent nodes because parent-child correlations are imperfect, enabling explicit sector-
We develop \emph{Topological Risk Parity} (TRP), a tree-based portfolio construction approach intended for long/short, market neutral, factor-aware portfolios. The method is motivated by the dominance of passive/factor flows that naturally create a tree-like structure in markets. We introduce two implementation variants: (i) a rooted minimum-spanning-tree allocator, and (ii) a market/sector-anchored variant referred to here as \emph{Semi-Supervised TRP}, which imposes SPY as the root node and the 11 sector ETFs as the second layer. In both cases, the key object is a sparse rooted topology extracted from a correlation-distance graph, together with a propagation law that maps signed signals into portfolio weights.
Relative to classical Hierarchical Risk Parity (HRP), TRP is non-binary and designed for signed cross-sectional signals and hedged long-short portfolios: it preserves signal direction while using return-dependence geometry to shape exposures. It accounts for the imperfect correlation between parent and child nodes, and thus does not propagate weights entirely to the children. We can also impose an economically motivated hierarchy involving industries, sub-industries, or factors. This makes it much more robust to macroeconomic shocks and crises, where within-cluster correlations might spike. These features make TRP well suited for market-neutral, equity stat-arb or L/S trend-type strategies, where enforcing neutrality or limiting exposures at the market, sector or factor level is extremely important.
We study a benchmarked risk-sensitive portfolio problem in a factor-based setting to bring together three strands of the literature: benchmarked risk-sensitive investment management, the Kuroda-Nagai change-of-measure method, and the free energy-entropy duality of Dai Pra et al. (1996). We show that the duality yields a direct solution of the benchmarked problem by reformulating it as a linear-quadratic-Gaussian stochastic differential game under a suitable equivalent probability measure, with an entropic regularization. The resulting value function is quadratic, the optimal controls are explicit affine feedback maps, and the optimal allocation admits two complementary interpretations: as a fractional Kelly strategy and as a Kelly portfolio adjusted via the entropic regularization. This formulation, therefore, contributes both a direct analytical route to the solution and a clearer interpretation of risk sensitivity, thereby embedding the classical Kuroda-Nagai change-of-measure approach within a more general framework. An added benefit of this formulation is that it is suitable for implementation via an RL algorithm. A simple implementation on U.S. equity data illustrates the tractability of the framework and numerically confirms the equivalence of the two approaches.
LLM-classified 24-hour jump causes show macroeconomic announcements price the strongest and most lasting premium, supporting a real-time re-
In this paper, I present the first comprehensive, around-the-clock analysis of systematic jump risk by combining high-frequency market data with contemporaneous news narratives identified as the underlying causes of market jumps. These narratives are retrieved and classified using a state-of-the-art open-source reasoning LLM. Decomposing market risk into interpretable jump categories reveals significant heterogeneity in risk premia, with macroeconomic news commanding the largest and most persistent premium. Leveraging this insight, I construct an annually rebalanced real-time Fama-MacBeth factor-mimicking portfolio that isolates the most strongly priced jump risk, achieving a high out-of-sample Sharpe ratio and delivering significant alphas relative to standard factor models. The results highlight the value of around-the-clock analysis and LLM-based narrative understanding for identifying and managing priced risks in real time.
In overround markets the optimal bets stay fixed while allocations adjust to satisfy the drawdown limit.
We study the finite mutually exclusive outcome version of risk-constrained Kelly optimization with explicit state prices. The market has outcome probabilities $p_i>0$, state prices $q_i>0$, terminal wealths $W_i=c+x_i/q_i$, and a drawdown-surrogate constraint \[ \sum_{i=1}^n p_i W_i^{-\lambda}\le 1,\qquad \lambda>0. \] For constant relative risk aversion utility, we work primarily in the standard overround regime $\sum_i q_i>1$, where every optimizer is necessarily non-full-support. Under the usual unique likelihood-ratio prefix hypothesis for the unconstrained problem, we prove that the constrained optimizer has exactly the same active set. Thus, in the regime where the prefix theorem is meaningful, the risk constraint deforms the funded wealth profile but does not change the active set. The support is therefore invariant across both the CRRA parameter and the drawdown-surrogate parameter.
We then isolate the logarithmic case $\gamma=1$. Once the common active prefix is known, the constrained problem reduces to a one-dimensional outer calibration together with independent one-dimensional inner equations on the active states. In this case we prove existence, uniqueness, and monotonicity for the inner solves, derive a complete calibration theorem, and record the resulting structured algorithm. We treat the fair and subfair regimes only as boundary cases: full-support phenomena can occur there, so the overround prefix theory no longer yields a parallel exact description of comparable sharpness. A numerical example illustrates how the risk constraint alters the funded wealth profile while leaving support unchanged.
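To make the objects in this abstract concrete, the following minimal Python sketch evaluates the wealth profile $W_i=c+x_i/q_i$, the left-hand side of the drawdown-surrogate constraint, and the expected log growth for a candidate allocation. It is an illustration only and not the paper's calibration algorithm: the budget normalisation $c+\sum_i x_i=1$, the numerical values, and all function names are assumptions introduced here.

```python
import math

def wealth_profile(c, x, q):
    """Terminal wealth W_i = c + x_i / q_i for cash c and stakes x at state prices q."""
    return [c + xi / qi for xi, qi in zip(x, q)]

def drawdown_surrogate(p, wealth, lam):
    """Left-hand side of the constraint: sum_i p_i * W_i**(-lam)."""
    return sum(pi * w ** (-lam) for pi, w in zip(p, wealth))

def log_growth(p, wealth):
    """Expected log wealth (the gamma = 1 objective)."""
    return sum(pi * math.log(w) for pi, w in zip(p, wealth))

# Illustrative numbers (assumed): an overround market, since sum(q) = 1.05 > 1.
p = [0.5, 0.3, 0.2]
q = [0.45, 0.35, 0.25]
lam = 2.0

# Two candidates that fund only the state with the highest ratio p_i / q_i.
# Both satisfy the assumed budget c + sum(x) = 1.
cautious = wealth_profile(0.95, [0.05, 0.0, 0.0], q)
aggressive = wealth_profile(0.80, [0.20, 0.0, 0.0], q)

for name, w in [("cautious", cautious), ("aggressive", aggressive)]:
    lhs = drawdown_surrogate(p, w, lam)
    print(name, "constraint LHS:", round(lhs, 4),
          "feasible:", lhs <= 1.0,
          "log growth:", round(log_growth(p, w), 4))
```

The two candidates were chosen to share the same funded state and differ only in stake size, so that the constraint can be seen to act on the wealth profile rather than on the support.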
Driven by the increasing frequency and intensity of natural disasters and chronic climate threats, we investigate the impact of physical climate risk on global equity portfolios. By employing a panel regression analysis on sectoral returns, we provide statistical evidence that extreme temperature events exert a negative effect on most sectors. We introduce two novel metrics based on these temperature anomalies, Climate Risk Exposure and Climate Exposure Volatility, to measure the environmental vulnerability of a portfolio. Unlike available static country-level indices, these metrics incorporate the time-varying probability of extreme events and its relation to firm-specific asset intensity. We integrate these measures into a multi-objective portfolio optimization framework. This approach extends the traditional mean-variance paradigm, allowing investors to construct portfolios that are resilient to physical climate shocks without sacrificing diversification. Finally, we conduct a backtesting analysis to show the practical benefits of incorporating these climate risk metrics into the investment process, evaluating how climate-aware strategies perform relative to traditional benchmarks.
Kelly objective splits into fixed money and entropy terms plus variable divergence, turning portfolio selection into an information-compress
In 1956 John Kelly wrote a paper at Bell Labs describing the relationship between gambling and Information Theory. What came to be known as the Kelly Criterion is both an objective and a closed-form solution to sizing wagers when odds and edge are known. Samuelson argued it was arbitrary and subjective, and successfully kept it out of mainstream economics. Luckily it lived on in computer science, mostly because of Tom Cover's work at Stanford. He showed that it is the uniquely optimal way to invest: it maximizes long-term wealth, minimizes the risk of ruin, and is competitively optimal in a game-theoretic sense, even over the short term.
One of Cover's most surprising contributions to portfolio theory was the universal portfolio. Related to universal compression in information theory, it performs asymptotically as well as the best constant-rebalanced portfolio in hindsight. I borrow a trick from that algorithm to show that Kelly's objective, even in the general form, factors the investing problem into three terms: a money term, an entropy term, and a divergence term. The only way to maximize growth is to minimize the divergence, which measures the difference between our distribution and the true distribution in bits. Investing is, fundamentally, a compression problem.
This decomposition also yields new practical results. Because the money and entropy terms are constant across strategies in a given backtest, the difference in log growth between two strategies measures their relative divergence in bits. I also introduce a winner fraction heuristic which allocates capital in proportion to each asset's probability of dominating the candidate set. The growth shortfall of this heuristic relative to the optimal portfolio is bounded by the entropy of the winner fraction distribution. To my knowledge, both the heuristic and the entropy bound are original contributions.
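The three-term structure can be checked numerically in the simplest special case, a horse race with mutually exclusive outcomes. There the expected log growth of betting fractions $b$ at gross odds $o$ under true probabilities $p$ satisfies the textbook identity $\sum_i p_i \log_2(b_i o_i) = \sum_i p_i \log_2 o_i - H(p) - D(p\,\|\,b)$. The sketch below is a hedged illustration of that identity only; the mapping of its three terms onto the abstract's "money", "entropy", and "divergence" terms, the example numbers, and the function name are assumptions, and the paper's general form is broader.

```python
import math

def growth_terms(p, b, odds):
    """Return (money, entropy, divergence, growth) in bits for a horse race.

    p: true outcome probabilities, b: betting fractions (sum to 1),
    odds: gross payout per unit staked on each outcome.
    """
    money = sum(pi * math.log2(oi) for pi, oi in zip(p, odds))
    entropy = -sum(pi * math.log2(pi) for pi in p if pi > 0)
    divergence = sum(pi * math.log2(pi / bi) for pi, bi in zip(p, b) if pi > 0)
    growth = sum(pi * math.log2(bi * oi)
                 for pi, bi, oi in zip(p, b, odds) if pi > 0)
    return money, entropy, divergence, growth

# Illustrative numbers (assumed).
p = [0.5, 0.3, 0.2]
odds = [2.0, 4.0, 5.0]

kelly = growth_terms(p, p, odds)                # betting b = p gives zero divergence
other = growth_terms(p, [0.4, 0.4, 0.2], odds)  # a mis-specified bettor

for name, (money, entropy, divergence, growth) in [("kelly", kelly), ("other", other)]:
    print(name, "growth:", round(growth, 4),
          "= money - entropy - divergence:",
          round(money - entropy - divergence, 4))

# Money and entropy do not depend on b, so the growth gap is the divergence gap.
print("growth gap in bits:", round(kelly[3] - other[3], 4))
```

Because the money and entropy terms are identical for both bettors, the difference in growth equals the difference in divergence, which is the comparison the abstract describes for two strategies in one backtest.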
This paper develops a decomposition of standard Risk Contribution (RC) into two economically interpretable components: inherent risk and correlation risk. Using a leave-one-out representation, each position's RC separates into a term reflecting its own volatility contribution independent of the portfolio and a term capturing its covariance with the remainder of the portfolio. The inherent component is always positive, arising from the intrinsic volatility of the position, while the correlation component may amplify or mitigate total portfolio risk depending on how the position moves relative to other holdings. Because the decomposition operates within standard RC, it preserves the property of strict additivity. This separation provides diagnostic insight not visible from aggregate risk contributions alone. It distinguishes whether a position contributes risk because it is volatile in isolation or because it is highly correlated with the rest of the portfolio, and it clarifies when a negatively correlated position functions as an effective hedge. Two approaches to time-series analysis are presented to track how inherent and correlation risk evolve across market regimes, revealing whether changes in portfolio risk during stress periods are driven by volatility shocks, correlation shifts, or both. Empirical illustrations suggest that the decomposition provides stable, transparent, and easily implementable risk diagnostics that can support portfolio risk reporting, stress testing, and performance attribution.
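One way to read the leave-one-out split is to write the standard risk contribution $RC_i = w_i(\Sigma w)_i/\sigma_p$ as $w_i^2\sigma_i^2/\sigma_p$ plus $w_i\sum_{j\neq i} w_j\sigma_{ij}/\sigma_p$. The sketch below assumes this is the intended form (the abstract does not state the formula explicitly) and uses made-up weights, volatilities, and correlations; the function name is hypothetical.

```python
import math

def decompose_risk_contributions(weights, cov):
    """Split each position's standard RC into inherent and correlation parts.

    Assumed form: inherent_i = w_i^2 * cov_ii / sigma_p,
    correlation_i = w_i * sum_{j != i} w_j * cov_ij / sigma_p.
    """
    n = len(weights)
    port_var = sum(weights[i] * cov[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    port_vol = math.sqrt(port_var)
    rows = []
    for i in range(n):
        inherent = weights[i] ** 2 * cov[i][i] / port_vol
        correlation = weights[i] * sum(weights[j] * cov[i][j]
                                       for j in range(n) if j != i) / port_vol
        rows.append({"inherent": inherent,
                     "correlation": correlation,
                     "total": inherent + correlation})
    return port_vol, rows

# Illustrative inputs (assumed): the third asset is negatively correlated
# with the other two and plays the role of a hedge.
vols = [0.20, 0.30, 0.15]
corr = [[1.0, 0.6, -0.5],
        [0.6, 1.0, -0.4],
        [-0.5, -0.4, 1.0]]
cov = [[vols[i] * vols[j] * corr[i][j] for j in range(3)] for i in range(3)]
weights = [0.5, 0.3, 0.2]

port_vol, rows = decompose_risk_contributions(weights, cov)
for i, row in enumerate(rows):
    print(i, {k: round(v, 4) for k, v in row.items()})
print("portfolio volatility:", round(port_vol, 4))
```

In this example the totals add up to portfolio volatility, every inherent term is positive, and the hedge's correlation term is negative, matching the properties the abstract lists.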
On a 93-actor quarterly panel mixing macro indicators, institutional data, and firm-level investment ratios, global factor augmentation degrades prediction for actor subgroups whose dynamics are misrepresented by the shared basis. A two-stage architecture -- global pooled AR(1) for shared persistence, block-specific local models for residual dynamics -- improves full-panel out-of-sample $R^2$ from 0.630 to 0.677 ($\Delta = +0.047$, CI $[+0.036, +0.058]$, 10/10 windows, placebo $p \leq 0.001$). A held-out decade test (block partition frozen on 2005--2014 data, evaluated on unseen 2015--2024 windows) confirms the gain ($\Delta = +0.050$, 10/10), and a stratified placebo that fixes the macro/firm data-type split and permutes only firm-sector assignments corroborates ($z = 7.25$, $p \leq 0.001$). Cross-regime replication on a 109-actor UK/EU heterogeneous panel ($\Delta = +0.017$, 8/8 windows) and a combined US + UK/EU panel of 202 actors ($\Delta = +0.030$, placebo $z = 9.68$ -- exceeding the original US-only $z = 7.82$) confirms the architecture transfers across regimes. A 146-firm CapEx/Assets robustness check refines the scope condition: the gain depends on cross-sectional dispersion in autoregressive structure, which data-type heterogeneity reliably produces but which is also present in firm-only panels under suitable ratio choices.
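The two-stage idea can be sketched on synthetic data. The example below fits one pooled AR(1) coefficient and then, per block, regresses the stage-one residuals on the lagged level. This is a simplified assumption about what "block-specific local models" means, the panel is simulated rather than the paper's data, and the comparison is in-sample, whereas the abstract reports out-of-sample $R^2$.

```python
import random

def ar1_slope(pairs):
    """No-intercept OLS slope of the target on the lagged value."""
    num = sum(lag * tgt for lag, tgt in pairs)
    den = sum(lag * lag for lag, _ in pairs)
    return num / den

def simulate(phi, n, rng, noise=0.1):
    """Simulate an AR(1) path of length n starting at 1.0."""
    y = [1.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, noise))
    return y

def lagged_pairs(series):
    """Pairs (y_{t-1}, y_t) for one series."""
    return list(zip(series[:-1], series[1:]))

rng = random.Random(0)
# Hypothetical panel: blocks A and B differ in persistence (0.9 versus 0.2).
panel = ([("A", simulate(0.9, 60, rng)) for _ in range(3)]
         + [("B", simulate(0.2, 60, rng)) for _ in range(3)])

# Stage 1: a single AR(1) coefficient pooled over every series.
phi_global = ar1_slope([pair for _, s in panel for pair in lagged_pairs(s)])

# Stage 2: per block, regress the stage-1 residual on the lagged level.
psi = {}
for block in ("A", "B"):
    residual_pairs = [(lag, tgt - phi_global * lag)
                      for b, s in panel if b == block
                      for lag, tgt in lagged_pairs(s)]
    psi[block] = ar1_slope(residual_pairs)

def sse(coef_for_block):
    """Sum of squared one-step errors given a block -> coefficient map."""
    return sum((tgt - coef_for_block(b) * lag) ** 2
               for b, s in panel for lag, tgt in lagged_pairs(s))

sse_pooled = sse(lambda b: phi_global)
sse_two_stage = sse(lambda b: phi_global + psi[b])
print("pooled phi:", round(phi_global, 3), "block corrections:", psi)
print("SSE pooled vs two-stage:", round(sse_pooled, 3), round(sse_two_stage, 3))
```

The local correction helps here only because the two blocks have different autoregressive structure, which is the scope condition the abstract identifies.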
MRP shows that high long-term Sharpe ratios do not guarantee resilience when market relationships weaken or alpha compresses.
Systematic investment strategies are exposed to a subtle but pervasive vulnerability: the progressive erosion of their effectiveness as market regimes change. Traditional risk measures, designed to capture volatility or drawdowns, overlook this form of structural fragility. This article introduces a quantitative framework for assessing the durability of systematic strategies through minimum regime performance (MRP), defined as the lowest realized risk-adjusted return across distinct historical regimes. MRP serves as a lower bound on a strategy's robustness, capturing how performance deteriorates when underlying relationships weaken or competitive pressures compress alpha. Applied to a broad universe of established factor strategies, the measure reveals a consistent trade-off between efficiency and resilience -- strategies with higher long-term Sharpe ratios do not always exhibit higher MRPs. By translating the persistence of investment efficacy into a measurable quantity, the framework provides investors with a practical diagnostic for identifying and managing strategy-decay risk, a novel dimension of portfolio fragility that complements traditional measures of market and liquidity risk.
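As a minimal sketch of the MRP definition, the code below labels each return with a regime, computes a risk-adjusted return per regime, and takes the minimum. It assumes the annualised Sharpe ratio of excess returns as the risk-adjusted measure and takes regime labels as given; the return series, regime names, and function names are invented for illustration and do not come from the article.

```python
import math
import statistics

def sharpe(returns, periods_per_year=12):
    """Annualised Sharpe ratio of a list of excess returns."""
    mean = statistics.fmean(returns)
    vol = statistics.stdev(returns)
    return mean / vol * math.sqrt(periods_per_year)

def minimum_regime_performance(returns, regimes, periods_per_year=12):
    """Return (worst regime, its Sharpe ratio, all regime Sharpe ratios)."""
    by_regime = {}
    for r, label in zip(returns, regimes):
        by_regime.setdefault(label, []).append(r)
    scores = {label: sharpe(rs, periods_per_year)
              for label, rs in by_regime.items()}
    worst = min(scores, key=scores.get)
    return worst, scores[worst], scores

# Illustrative monthly excess returns (assumed), six per regime.
returns = [0.02, 0.01, 0.03, -0.01, 0.02, 0.015,
           0.00, -0.01, 0.005, -0.005, 0.01, -0.002]
regimes = ["expansion"] * 6 + ["crowded"] * 6

worst, mrp, scores = minimum_regime_performance(returns, regimes)
full_sample = sharpe(returns)
print("regime Sharpe ratios:", {k: round(v, 2) for k, v in scores.items()})
print("MRP:", round(mrp, 2), "in regime", worst)
print("full-sample Sharpe:", round(full_sample, 2))
```

In this toy series the full-sample Sharpe ratio is positive while the MRP is negative, which illustrates how a long-run average can mask a weak regime.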
We study a discrete-time multi-period portfolio optimization problem under an explicit constraint on the Deviation Conditional Value-at-Risk (DCVaR), defined as the excess of Conditional Value-at-Risk over expected terminal wealth. The objective is to maximize expected return subject to a global tail-risk constraint, leading to a time-inconsistent precommitment problem. We propose a recurrent neural-network-based approach to approximate the optimal precommitment policy, which accommodates path-dependent risk constraints and high-dimensional state dynamics without relying on dynamic programming. The explicit constraint formulation allows for exact penalty methods and provides a transparent notion of feasibility. The methodology is validated in a classical complete-market financial model and extended to a multi-period portfolio allocation problem in (re)insurance, capturing the long-term risk dynamics of insurance liabilities.
This paper studies an $\alpha$-robust utility maximization problem where an investor faces an intractable claim -- an exogenous contingent claim with known marginal distribution but unspecified dependence structure with financial market returns. The $\alpha$-robust criterion interpolates between worst-case ($\alpha=0$) and best-case ($\alpha=1$) evaluations, generalizing both extremes through a continuous ambiguity attitude parameter. For weighted exponential utilities, we establish via rearrangement inequalities and comonotonicity theory that the $\alpha$-robust risk measure is law-invariant, depending only on marginal distributions. This transforms the dynamic stochastic control problem into a concave static quantile optimization over a convex domain. We derive optimality conditions via calculus of variations and characterize the optimal quantile as the solution to a two-dimensional first-order ordinary differential equation system, which is a system of variational inequalities with mixed boundary conditions, enabling numerical solution. Our framework naturally accommodates additional risk constraints such as Value-at-Risk and Expected Shortfall. Numerical experiments reveal how ambiguity attitude, market conditions, and claim characteristics interact to shape optimal payoffs.
The limit of regime-switching approximations gives the first explicit optimal policy under the Cramér-Lundberg jump model.
We consider an optimal dividend payout problem for an insurance company whose surplus follows the classical Cramér-Lundberg model. The dividend rate is subject to a ratcheting constraint (i.e., it must be nondecreasing over time), and the company may inject capital at a proportional cost to avoid ruin. This problem gives rise to a stochastic control problem with a self-path-dependent control constraint, costly capital injections, and jump-diffusion dynamics. The associated Hamilton-Jacobi-Bellman (HJB) equation is a partial integro-differential variational inequality featuring both a nonlocal integral term and a gradient constraint.
We develop a systematic probabilistic and PDE-based approach to solve this HJB equation. By discretizing the space of admissible dividend rates, we construct a sequence of approximating regime-switching systems of ordinary integro-differential equations. Through careful a priori estimates and a limiting argument, we prove the existence and uniqueness of a strong solution in a suitable space. This regularity result is fundamental: it allows us to characterize the optimal dividend policy via a switching free boundary and to construct an explicit optimal feedback control strategy. To the best of our knowledge, this is the first complete solution -- comprising both the value function and an implementable optimal strategy -- for a dividend ratcheting problem with capital injection under the Cramér-Lundberg model. Our work advances the mathematical theory of optimal stochastic control beyond the standard viscosity solution framework, providing a rigorous foundation for dividend policy design in economics.
This paper proposes a machine-learning-assisted portfolio optimization framework designed for low-data environments and regime uncertainty. We construct a teacher-student learning pipeline in which a Conditional Value-at-Risk (CVaR) optimizer generates supervisory labels, and neural models (Bayesian and deterministic) are trained using both real and synthetically augmented data. The synthetic data are generated using a factor-based model with t-copula residuals, enabling training beyond the limited real sample of 104 labeled observations. We evaluate four student models under a structured experimental framework comprising (i) controlled synthetic experiments (3 x 5 seed grid), (ii) in-distribution real-market evaluation (C2A), and (iii) cross-universe generalization (D2A). In real-market settings, models are deployed using a rolling evaluation protocol in which a frozen pretrained model is periodically fine-tuned on recent observations and then reset to its base state, ensuring stability while allowing limited adaptation. Results show that student models can match or outperform the CVaR teacher in several settings, while achieving improved robustness under regime shifts and reduced turnover. These findings suggest that hybrid optimization-learning approaches can enhance portfolio construction in data-constrained environments.
A proportional tax remains a uniform adjustment yet changes the stationary wealth distribution when investor abilities differ.
We extend the Fokker-Planck framework of Froseth (2026, arXiv:2603.05283) to populations of investors with heterogeneous, persistent return-generating ability. When the drift coefficient in the Langevin equation for log-wealth varies across investors, the proportional wealth tax remains a uniform drift shift but ceases to be neutral in the economic sense: its real incidence differs across ability types, and the stationary wealth distribution changes shape. We derive the extended Fokker-Planck equation on the joint space of log-wealth and ability, characterise the conditions under which the drift-shift symmetry breaks, and identify the consequences for asset prices and portfolio allocations. The analysis connects the neutrality results of Froseth (2026, arXiv:2603.05264) and the Fokker-Planck dynamics of Froseth (2026, arXiv:2603.05283) to the heterogeneous-returns literature, notably the "use-it-or-lose-it" mechanism of Guvenen, Kambourov, Kuruscu, Ocampo-Diaz and Chen (2023).
After-tax excess returns become a uniform rescaling of pre-tax returns, with flow-tax and stock-tax distortions adding separately.
A proportional wealth tax - a levy on the stock of wealth - preserves portfolio neutrality by acting as a uniform drift shift in the Fokker-Planck equation for wealth dynamics. We extend this result to the full system of ownership taxes (eierkostnader) that a shareholder faces: a corporate tax on gross profits, a capital income tax on the risk-free return, a dividend and capital gains tax on the excess return, and a wealth tax on net assets. Each tax modifies the drift of the wealth process in a distinct way - multiplicative rescaling, constant shift, or regime-dependent compression - while leaving the diffusion coefficient unchanged. We show that the combined system preserves portfolio neutrality under three conditions: (i) the capital income tax rate equals the corporate tax rate, (ii) the shielding rate equals the risk-free rate, and (iii) the wealth tax assessment is uniform across assets. When these conditions hold, the after-tax excess return is a uniform rescaling of the pre-tax excess return by the factor $(1-\tau_c)(1-\tau_d)$, and the drift-shift symmetry of the wealth-tax-only case generalises to a drift-shift-and-rescale symmetry. We classify the distortions that arise when each condition fails and show that flow-tax distortions and stock-tax distortions are additively separable: they do not interact. The shielding deduction - a feature of several real-world tax systems, including the Norwegian aksjonaermodellen - emerges as the mechanism that restores the symmetry between equity and debt taxation within this framework. Calibrated to the Norwegian dual income tax, conditions (i) and (ii) hold by institutional design; the only binding distortion is non-uniform wealth tax assessment, which generates portfolio tilts roughly 300 times larger than any residual flow-tax channel.
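The uniform-rescaling claim amounts to simple arithmetic: when the three conditions hold, every asset's excess return is multiplied by the same factor $(1-\tau_c)(1-\tau_d)$, so ratios of excess returns across assets are unchanged. The sketch below illustrates only this step; the tax rates and pre-tax excess returns are placeholder values chosen for the example, not the Norwegian calibration, and the function name is hypothetical.

```python
def after_tax_excess(pre_tax_excess, tau_c, tau_d):
    """Scale each pre-tax excess return by (1 - tau_c) * (1 - tau_d)."""
    factor = (1.0 - tau_c) * (1.0 - tau_d)
    return [factor * r for r in pre_tax_excess]

# Placeholder inputs (assumed): three assets and illustrative tax rates.
pre_tax = [0.04, 0.06, 0.10]
post_tax = after_tax_excess(pre_tax, tau_c=0.25, tau_d=0.30)

# Relative excess returns, measured against the first asset.
pre_ratios = [r / pre_tax[0] for r in pre_tax]
post_ratios = [r / post_tax[0] for r in post_tax]

print("after-tax excess returns:", [round(r, 5) for r in post_tax])
print("pre-tax ratios: ", [round(r, 5) for r in pre_ratios])
print("post-tax ratios:", [round(r, 5) for r in post_ratios])
```

Since the relative excess returns are identical before and after tax, a portfolio rule that depends only on those ratios is left unchanged in this setting.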