Transformers perform kernel-based prediction for Hölder regression on manifolds and achieve intrinsic-dimension-dependent minimax rates with sufficient training tasks.
hub
The intrinsic dimension of images and its impact on learning
17 Pith papers cite this work. Polarity classification is still indexing.
roles
background 2
polarities
background 2
representative citing papers
The subspace intervention framework reveals that pre-training objectives shape how ViTs encode geometric information in compressible low-rank subspaces, with peak precision at intermediate layers.
A game-theoretic model for multi-agent resource acquisition establishes BNE existence and computability under common priors, gives convergence conditions for learning dynamics, and reports simulations on financial data.
UR-JEPA applies uniform rectifiability regularization via a smoothed Carleson square function to JEPA training, producing embeddings whose PCA spectrum drops by 4-5 orders of magnitude at dimension 20-25, with lower seed variance than Gaussian regularization on Inet10, Galaxy10, and EuroSAT.
CreFlow combines LTL compositional rewards with credit-aware NFT and corrective reflow losses in online RL to improve embodied video diffusion models, raising downstream task success by 23.8 percentage points on eight bimanual manipulation tasks.
A single-network implicit neural optimal transport method that solves the c-transform via proximal fixed-point iteration for stable, non-adversarial training.
Transformers achieve approximation and generalization error bounds for noisy manifold regression that scale with the intrinsic dimension of the task-level manifold.
For a broad class of coefficients, diffusion models achieve Õ(k/ε) iteration complexity for ε-accurate TV sampling under low-dimensional structure, independent of ambient dimension.
SAMG uses spatially adaptive guidance scales derived from a geometric analysis of classifier-free guidance to resolve the detail-artifact dilemma in diffusion-based image and video generation.
Reasoning in LLMs produces a transient geometric pulse in which concept manifolds untangle into linearly separable subspaces immediately before computation and compress afterward.
MoE Top-k routing equals the k-th elementary symmetric tropical polynomial, making sparsity combinatorial depth that scales capacity by binom(N,k) and gives MoE combinatorial resilience on manifolds.
Intrinsic dimension of quantum trajectories serves as an unsupervised probe sensitive to chaos, integrability, and ergodicity breaking in dissipative quantum systems.
BMTI estimates log-density via integration of neighbor differences on data manifolds using maximum-likelihood weighting, without binning or explicit coordinates.
ShuffleFlow is a variational inference framework that partitions images with pixel-unshuffling and models the joint posterior over sub-images using a shared conditional normalizing flow conditioned on neural field features for scalable Bayesian inverse imaging.
Reusing source latent spaces in diffusion models under distribution shift produces target score error set by principal-angle misalignment and diffusion-time-amplified ambient noise.
A benchmark across architectures and shift regimes finds that OOD detector rankings shift with representation collapse, and proposes an NC-based shortlist predictor and a PCA filter that require no extra OOD data.
Neural regression collapse occurs when last-layer feature intrinsic dimension falls below target intrinsic dimension, creating over-compressed and under-compressed regimes that govern generalization based on data quantity and noise.
citing papers explorer
Implicit Neural Optimal Transport via Fixed-Point Optimization
A single-network implicit neural optimal transport method that solves the c-transform via proximal fixed-point iteration for stable, non-adversarial training.