cs.NE — Pith

quant-ph 2026-07-03

Bounded gate stabilizes quantum fast-weight programmers

by Kuo-Chung Peng, Jiun-Cheng Jiang +9 more

Stable Self-Modulating Quantum Fast-Weight Programmers with Bounded Memory Gates

Tanh on old-state memory removes divergence in long sequences and improves robustness on forecasting tasks.

Quantum Fast-Weight Programmers (QFWPs) store temporal information in dynamically programmed variational-circuit parameters rather than in nonlinear recurrent hidden states, offering a practical route to quantum sequence modeling. Self-Modulating QFWP improves this framework by using input-dependent gates for both new fast-weight updates and the accumulated fast-weight state, but its unbounded old-state multiplier can diverge in long-sequence regimes. We propose a bounded old-state modulation rule that applies a sign-preserving tanh gate only to the recurrent memory branch while leaving the additive update and new-update modulation unchanged. We evaluate standard QFWP, full Self-Modulating QFWP, Only-New, and Only-Old variants on two CUDA-Q quantum-dynamics forecasting tasks and on Milan SMS telecommunication activity prediction. The quantum-dynamics results show that old-state modulation is the most consistent source of improvement over Standard QFWP, and that bounding the old-state gate removes long-sequence divergence while improving aggregate robustness. On Milan SMS forecasting, the original unbounded Self-Modulating QFWP converges across the tested grid and shows its clearest gains at longer input windows, with behavior close to the Only-Old ablation. These findings identify accumulated-memory modulation as the key mechanism of Self-Modulating QFWP and bounded old-state gating as a targeted stabilization strategy.
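
The divergence mechanism described in this abstract can be illustrated with a scalar toy recurrence. This is only a sketch under stated assumptions: the fast weight is a single number, both gates are held constant (in the paper they are input-dependent), and the function names `step_unbounded`, `step_bounded`, and `rollout` are hypothetical. The exact update rule used by the authors is not given in the abstract.

```python
import math


def step_unbounded(theta, delta, g_old, g_new):
    # Assumed form: old-state multiplier applied directly to the memory.
    return g_old * theta + g_new * delta


def step_bounded(theta, delta, g_old, g_new):
    # Assumed form: tanh keeps the old-state multiplier in (-1, 1) and
    # preserves its sign; the additive update and new-update gate are untouched.
    return math.tanh(g_old) * theta + g_new * delta


def rollout(step, g_old, steps=200, delta=0.1, g_new=1.0):
    # Apply the same update repeatedly, as in a long input sequence.
    theta = 0.0
    for _ in range(steps):
        theta = step(theta, delta, g_old, g_new)
    return theta


unbounded = rollout(step_unbounded, g_old=1.5)  # grows geometrically
bounded = rollout(step_bounded, g_old=1.5)      # settles near a fixed point
```

With a constant multiplier above 1 the unbounded recurrence grows geometrically with sequence length, while the bounded one contracts toward `delta / (1 - tanh(g_old))`.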

cs.NE 2026-07-03

Q-learning speeds genetic bin-packing solver by 50x

by Zitouni Rania, Mostefai Mounir Sofiane +4 more

Hybridizing a Grouping Metaheuristic with Reinforcement Learning for the One-Dimensional Bin Packing Problem

Dynamic operator selection matches HGGA quality while cutting runtime from 64 s to 1.3 s on standard benchmarks.

The one-dimensional bin packing problem (1D-BPP) is a canonical NP-hard combinatorial optimization problem with broad industrial applications. We propose RL-HGGA, a hybrid algorithm that integrates Falkenauer's Hybrid Grouping Genetic Algorithm (HGGA) with a tabular Q-learning controller. Rather than applying genetic operators at fixed probabilities, a Q-learning agent dynamically selects among eight macro-actions -- including BPCX crossover, light and heavy mutation, Martello-Toth local search, and population restart -- based on an eight-dimensional state representation encoding generation progress, stagnation level, optimality gap, average fitness, population variance, and average bin fill rate. The agent is trained with an epsilon-greedy policy over 400 episodes, with epsilon decaying to 0.05. Experiments on standard benchmark families (Falkenauer T/U, Scholl 1-3, Hard28) show that RL-HGGA achieves an average optimality gap of 0.95% -- competitive with HGGA (0.75%) and well below FFD (2.47%) -- while reducing mean computation time from 64.22 s to 1.29 s, a 50x speedup. These results demonstrate that learned adaptive operator selection can achieve near-HGGA solution quality at a fraction of the computational cost.
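
A tabular Q-learning controller of the kind described above might look roughly like the following. This is a minimal sketch, not the authors' code: the class name `QController`, the learning rate, the discount factor, the linear epsilon schedule, and the use of a hashable tuple as a discretized state are all assumptions. Only the eight actions, the 400 episodes, and the final epsilon of 0.05 come from the abstract.

```python
import random
from collections import defaultdict

N_ACTIONS = 8  # eight macro-actions per the abstract; the indices are placeholders


class QController:
    def __init__(self, alpha=0.1, gamma=0.9, eps_start=1.0, eps_end=0.05,
                 episodes=400, seed=0):
        # Q-table: maps a (discretized) state to one value per macro-action.
        self.q = defaultdict(lambda: [0.0] * N_ACTIONS)
        self.alpha, self.gamma = alpha, gamma
        self.eps_start, self.eps_end = eps_start, eps_end
        self.episodes = episodes
        self.rng = random.Random(seed)

    def epsilon(self, episode):
        # Assumed linear decay from eps_start to eps_end over the training run.
        frac = min(episode / max(self.episodes - 1, 1), 1.0)
        return self.eps_start + frac * (self.eps_end - self.eps_start)

    def select(self, state, episode):
        # Epsilon-greedy: explore with probability epsilon, else pick the argmax.
        if self.rng.random() < self.epsilon(episode):
            return self.rng.randrange(N_ACTIONS)
        values = self.q[state]
        return max(range(N_ACTIONS), key=values.__getitem__)

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

In a genetic-algorithm loop, `select` would be called once per generation to choose which operator to apply, and `update` would be called with a reward derived from the change in solution quality. The reward definition is not specified in the abstract.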

cs.NE 2026-07-03

Single dendritic compartment embeds LMS for in-context learning

by Juwei Shen, Yujie Wu +1 more

Dendritic In-Context Learning in a Single-Layer Spiking Neural Network

Subthreshold dynamics match leaky online Widrow-Hoff updates, allowing stable ICL in one layer without plasticity or depth.

In-context learning (ICL) operates via implicit gradient descent embedded in the forward pass of modern AI architectures -- Transformers, Mamba, state-space models, and MLPs. Capturing this capability in biologically plausible Spiking Neural Networks (SNNs) has remained an open challenge: existing SNNs fail the Garg-2022 benchmark at non-trivial task dimensions. We trace this failure to a structural assumption: prior SNN designs route adaptation through inference-time synaptic plasticity, viewing the dendritic compartment as a passive conduit for error or teacher signals. We challenge this assumption. The subthreshold dynamics of a single dendritic compartment already implement a complete online learning algorithm. By treating the compartment as the computational substrate rather than a passive conduit, we propose DendriCL -- a single-layer compartmental spiking architecture whose apical recurrence is structurally identical to leaky online Widrow-Hoff LMS. This dynamics-only update collapses the architectural depth required for general-purpose ICL to a single layer. DendriCL is uniquely seed-stable at super-dimensional Garg-2022 ICL -- where dense Transformers exhibit grokking-style instability and fail past moderate task dimension -- and a linear probe recovers the reference online-LMS trajectory directly from the apical membrane at R^2 = 0.93, showing the algorithm is structurally embedded in the dynamics rather than implicitly discovered during training. Taken together, ICL requires neither attention, depth, nor inference-time plasticity: a single compartment with online-LMS dynamics is sufficient.
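
For reference, the textbook leaky online Widrow-Hoff (LMS) rule that the abstract compares the apical dynamics to can be sketched as below. This is the generic algorithm, not DendriCL itself; the function name `leaky_lms_step`, the step size, the leak coefficient, and the toy regression task are assumptions made for illustration.

```python
import random


def leaky_lms_step(w, x, y, lr=0.05, leak=0.001):
    # Prediction error on the current (x, y) pair.
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = y - pred
    # Leaky LMS: shrink the weights slightly, then take a gradient step.
    return [(1.0 - lr * leak) * wi + lr * err * xi for wi, xi in zip(w, x)]


# Toy in-context regression: the weights are learned online from a stream
# of examples generated by a fixed linear map.
rng = random.Random(0)
w_true = [0.5, -1.0, 2.0]
w = [0.0, 0.0, 0.0]
for _ in range(2000):
    x = [rng.gauss(0.0, 1.0) for _ in w_true]
    y = sum(a * b for a, b in zip(w_true, x))
    w = leaky_lms_step(w, x, y)
```

The leak term pulls the weights slightly toward zero, so the estimate settles close to, but not exactly at, `w_true`.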

cs.LG 2026-07-03

Stacking ensemble flags early Alzheimer's from ADNI records

by Debopriya Ghosh

Predicting Early Stages Of Alzheimer's Disease And Identifying Key Biomarkers Using Deep Artificial Neural Network And Ensemble Of Machine Learning Methodologies

After fixing missing values and imbalance, the model ranks biomarkers while comparing classifiers on standard accuracy measures.

Alzheimer's disease (AD) is a brain disorder that develops slowly and mainly affects memory, thinking, language, and daily activities. It is one of the most common causes of dementia and creates many difficulties for patients as well as their families. In the early stage, the symptoms are often mild and may look like normal ageing. For this reason, many people are diagnosed late, when the disease has already progressed. At present, there is no complete cure for AD. Still, early detection can help doctors manage the condition better and take suitable steps at the right time. In this study, a machine learning model is proposed to detect the early stages of Alzheimer's disease using clinical details, neuropsychological test scores, and neuroimaging-related measures. The data used in this work is collected from the Alzheimer's Disease Neuroimaging Initiative (ADNI). As the dataset has missing values, iterative imputation is applied to fill them. The dataset also has class imbalance, which is handled using Borderline SVM-SMOTE. After that, feature selection is carried out using wrapper-based and embedded methods so that only important features are used for training. The selected features are divided into training and testing sets, and feature scaling is applied. A stacking ensemble model is developed using Logistic Regression, Extra Trees, Bagging KNN, and LightGBM as base classifiers. Along with this, an artificial neural network is also trained on the same dataset. The performance of these models is compared using precision, recall, F1-score, and AUC-ROC. This study aims to find the best classifier and also identify important biomarkers that may help in the early diagnosis of Alzheimer's disease.

cs.NE 2026-07-03

Phase-locked loop yields simple bursting neuron circuit

by Lev V. Takaishvili, Vladimir I. Ponomarenko +2 more

Electronic Bursting Neuron: design, equations and hardware implementation

Adjusted equations produce hardware that matches demanded regimes and extends to small networks.

Electronic neurons are a keystone for constructing spiking neural networks, which have numerous applications in neuroprosthetics, artificial memory, intensive computation, etc. A number of electronic neuron concepts have already been proposed, with some of them implemented in hardware. However, new schemes are of significant interest since the existing ones do not meet all requirements: they are too complex and expensive to realize, or they cannot demonstrate all demanded regimes, or they lack an appropriate mathematical description and can therefore be investigated only experimentally. In this study we propose a new design of a bursting electronic neuron constructed as a circuit implementation of the equations of a phase-locked loop system. To succeed, we use a novel hybrid approach: we start from phenomenological equations that provide the demanded regimes, then adjust and modify these equations to simplify the implementation, rather than implementing biophysical equations in hardware directly or writing equations for an already constructed circuit. The resulting circuit is simple to implement and closely matches the underlying equations. It can be used to describe not only a single neuron but also small neural circuits.

cs.NE 2026-07-03

Evolving WFC inputs improves quality for local domains

by Dipika Rajesh, Ahmed Khalifa +1 more

Evolutionary Wave Function Collapse

Optimizing small examples yields better levels when properties arise from local relationships rather than global ones.

Wave Function Collapse (WFC) is a widely used procedural content generation method that learns local adjacency constraints from example inputs to generate larger outputs. In this paper, we explore combining WFC with evolutionary search by evolving the small input examples used by WFC rather than directly evolving complete levels. In this approach, WFC acts as a genotype-to-phenotype mapping. The generated levels are then evaluated through domain-specific fitness functions. We evaluate the method in two domains with different relationships between local and global structure: Maze connectivity maps and Zelda-style dungeon layouts. Our results show that evolutionary optimization over WFC inputs improves generation quality in domains where properties emerge from local relationships, while domains requiring global constraints remain challenging. These findings suggest that evolutionary search can effectively guide WFC generation when target objectives align with local structure.
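
To make the genotype-to-phenotype idea concrete, the sketch below shows how local adjacency constraints could be read off a small example grid, in the spirit of WFC's simple tiled variant. It is an illustrative assumption rather than the authors' pipeline: the function name `learn_adjacency`, the two-direction rule encoding, and the toy tiles `#` and `.` are hypothetical, and the paper may use a different WFC variant.

```python
def learn_adjacency(example):
    # Collect every (tile, direction, neighbour) triple seen in the example.
    # Only "right" and "down" are stored; the reverse directions are implied.
    rules = set()
    rows, cols = len(example), len(example[0])
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                rules.add((example[r][c], "right", example[r][c + 1]))
            if r + 1 < rows:
                rules.add((example[r][c], "down", example[r + 1][c]))
    return rules


# A 2x2 example input: the "genotype" an evolutionary search would mutate.
example = ["##",
           "#."]
rules = learn_adjacency(example)
```

Mutating a single tile of `example` changes the rule set, and therefore the whole family of larger outputs a WFC solver could generate from it. This is why searching over inputs can steer properties that depend on local structure.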

cs.NE 2026-07-03

Metabolic optimizer loop stays bounded under mild resource rules

by Jinliang Xu, Liping Ma

Mechanism and Stability Analysis of Metabolic Closed-Loop Metaheuristics

Bounded gains and spending keep private energy and communal budget nonnegative and trigger three distinct internal regimes.

This paper studies the Metabolic Multi-Agent Optimizer (MMAO) at the framework level rather than at the implementation or benchmark level. The central question is whether the metabolic resource loop of private energy, communal budget, role drift, and lifecycle turnover has a framework-level interpretation beyond narrative metaphor. We introduce a generic MMAO state model that abstracts away domain-specific move operators while retaining the resource bookkeeping that defines the framework. Under mild bounded-gain and bounded-spending assumptions, we establish boundedness and nonnegativity properties for private energy, communal budget, role state, and active population size. We then characterize three endogenous behavioral regimes of the loop: contraction under sustained resource deficit, reinvestment under surplus communal accumulation, and search redistribution under heterogeneous marginal returns across agents or subgroups. The analysis is intentionally conservative. It does not claim global convergence of the full adaptive system, universal superiority over specialist optimizers, or a complete stationary characterization of the resulting process. Instead, it identifies which internal regulation properties are generic consequences of the loop and which remain implementation specific. A compact mechanism-validation package on representative continuous and discrete MMAO realizations provides supporting empirical evidence for this reading, but is not intended to replace a full benchmark study. The resulting contribution is therefore a bounded, regenerative, resource-regulated interpretation of MMAO, rather than a complete proof of all adaptive behaviors of the full algorithm family.

cs.NE 2026-07-02

Metabolic optimizer jointly selects features and tunes classifiers

by Jinliang Xu, Liping Ma

MMAO-Cls: Metabolic Multi-Agent Optimization for Joint Feature Selection and Classifier Tuning

MMAO-Cls reaches competitive test scores while returning the smallest average feature subsets on seven tabular benchmarks.

This paper studies whether the Metabolic Multi-Agent Optimizer (MMAO) can act as a credible outer-loop optimizer for classification model selection. We propose MMAO-Cls, a mixed-space realization in which each agent jointly encodes a binary feature mask and classifier hyperparameters, while private energy, communal budget, role drift, and lifecycle turnover are mapped to the accuracy-complexity tradeoff of wrapper learning. The implementation is strengthened by deriving feature-budget adaptation from feature-information priors and by regularizing validation reward with both subset compactness and train-validation overfitting gap. We evaluate MMAO-Cls on seven standard tabular benchmarks with three seeds each and compare it against RandomSearch, GA-lite, PSO-lite, and an endogenous no-sharing ablation. On the aggregate validation objective, MMAO-Cls ranks second (0.9433) behind GA-lite (0.9446). On held-out test performance, it reaches mean score 0.8882, improving over RandomSearch (0.8808) and GA-lite (0.8857), remaining close to PSO-lite (0.8874) and the no-sharing ablation (0.8900), while using the most compact mean held-out feature subset among all compared methods (feature ratio 0.4881). Pairwise tests show that these margins are not yet statistically significant. The resulting claim is therefore conservative: MMAO-Cls supports classification applicability and compact mixed-space search more clearly than it isolates communal sharing as a decisive standalone advantage.

cs.NE 2026-07-02

Simple random mutations find self-replicators as easily as paired interactions

by Charlotte Knierim, Luca Versari +3 more

BFF: Simple explanations for complex phenomena

Ancestry limits block takeover but not first appearance in program-space simulations.

The "Computational Life" paper (Agüera y Arcas et al., 2024) argues that paired interactions in a computational soup are an effective way to find self-replicators. In this work, aided by recent developments in self-replicator detection, we explore the alternate hypothesis that self-replicators can be found at least as easily using simple mutation random walks in program space. We also explore the claim that capping the maximum "depth" and "width" of the ancestry tree stops self-replicators from emerging, showing instead that it merely stops self-replicators from taking over the soup.

cs.NE 2026-07-02

Metabolic optimizer reaches 28.07 mean offline error in dynamic tests

by Jinliang Xu, Liping Ma

MMAO-Dyn: A Metabolic Multi-Agent Optimizer for Dynamic Optimization

Mapping energy and role controls to changing environments improves recovery over generic and other dynamic methods.

This paper studies whether the Metabolic Multi-Agent Optimizer (MMAO) can be credibly derived into a dynamic-optimization method without replacing its core metabolic control loop by external adaptation modules. The proposed MMAO-Dyn maps private energy, communal budget, role drift, success feedback, and lifecycle turnover to a nonstationary setting in which environmental changes repeatedly invalidate previously useful local structure. We evaluate MMAO-Dyn on an 18-scenario synthetic dynamic continuous benchmark matrix covering shifted sphere, shifted Ackley, and shifted Rastrigin landscapes at 10D, 20D, and 30D, with two change severities and 12 seeds per scenario. The comparison layer includes a generic MMAO variant without dynamic derivation, dynamic random search, dynamic PSO-lite, dynamic DE-lite, and three endogenous ablations. Across the full 216-run matrix, MMAO-Dyn attains mean offline error 28.07, improving over Generic-MMAO (29.36), Dynamic-PSO-lite (34.65), Dynamic-DE-lite (67.09), and Dynamic-RandomSearch (111.37). The gains are clearest in aggregate robustness on sphere and Rastrigin families and in 10-step post-change recovery relative to the generic backbone, whereas the seed-aligned comparison with Dynamic-PSO-lite remains unfavorable in win-loss count and the NoMemoryRefresh ablation stays very close to the full method. We therefore position MMAO-Dyn as a credible family-expansion result for MMAO: the metabolic loop can generate meaningful dynamic behavior, but the strongest current value lies in recovery-oriented resource redistribution rather than in universal dominance or in a fully optimized submechanism design.
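
The scenario and run counts quoted in this abstract follow from the stated factors, as the short enumeration below shows. The severity labels `"low"` and `"high"` are placeholders (the abstract only says there are two severities), and the variable names are hypothetical.

```python
from itertools import product

landscapes = ["sphere", "ackley", "rastrigin"]  # shifted variants per the abstract
dims = [10, 20, 30]
severities = ["low", "high"]  # placeholder labels for the two change severities
seeds = range(12)

# 3 landscapes x 3 dimensions x 2 severities = 18 scenarios
scenarios = list(product(landscapes, dims, severities))

# 18 scenarios x 12 seeds = 216 runs per method
runs = list(product(scenarios, seeds))
```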

cs.NE 2026-07-02

MFEA-CoD finds diverse novel solutions across tasks with less redundancy

by Jiao Liu, Yanchi Li +3 more

From Consistency to Collaborative Discovery: MFEA-CoD for Multitask Novelty Search

Repulsion between tasks and adaptive sharing of discoveries improve efficiency especially when objectives deceive standard search.

Evolutionary multitasking (EMT) has shown strong capability in solving multiple optimization problems simultaneously by exploiting latent inter-task consistency, such as similarities in promising solutions or search directions. However, most existing EMT studies remain focused on objective-driven optimization, where such consistency is mainly used to accelerate convergence toward predefined optima. In this paper, we move EMT from consistency to collaborative discovery and propose a multifactorial evolutionary algorithm with collaborative discovery (MFEA-CoD) for multitask novelty search. Unlike conventional EMT, MFEA-CoD coordinates multiple novelty search tasks to collaboratively discover behaviorally novel solutions rather than merely transferring consistent search information for faster convergence. Specifically, a multitask repulsion operator encourages different tasks to explore distinct regions of the unified search space, thereby reducing redundant behavioral discoveries. Meanwhile, an adaptive inter-task transfer mechanism exploits shared discovery opportunities in overlapping novelty-improving regions by adjusting the transfer probability according to the online contribution of transferred information. Furthermore, MFEA-CoD is extended to multitask novelty-augmented optimization, where behavioral novelty is jointly considered with objective information to alleviate premature convergence caused by deceptive objectives. Experiments on synthetic basin-type problems, deceptive maze navigation problems, MuJoCo policy optimization problems, and generative novelty search problems demonstrate that MFEA-CoD improves the efficiency of discovering diverse novel solutions and shows clear advantages in deceptive objective landscapes.

cs.NE 2026-07-02

Memristive signed weights sustain anti-phase attractors autonomously

by Riley Acker, Aman Desai +2 more

Self-Organized Learning in Oscillatory Neural Networks with Memristive Signed Couplings

Circuit simulations show negative effective weights let phase-coded memories persist after training for denoising tasks.

Oscillatory neural networks (ONNs) have emerged as a promising neuromorphic architecture, leveraging coupled dynamical systems to perform computation and represent information through phase relationships. Their interactions can be designed to support intrinsic energy-minimizing dynamics, enabling tasks such as associative memory and optimization, and positioning them as a candidate architecture for continuous learning and inference. We present a neuromorphic primitive implemented using memristive edges with inhibitory couplings as a potential design for autonomous learning, and provide circuit simulation validation that the system is capable of denoising noisy inputs on an auto-associative task. While numerical Hopfield/Ising models routinely assume signed weights, neuromorphic implementations of ONNs often fail to realize negative weights due to device and circuit constraints. A practically implementable route to inhibitory (negative) weights is particularly valuable: it expands the class of attractor structures accessible to oscillator networks beyond purely synchronous couplings, and supports phase-coded memories where anti-phase constraints are not merely transiently enforced during training but can persist autonomously after release. We provide circuit simulations and theoretical analyses demonstrating that signed effective weights are necessary for anti-phase attractors to persist autonomously.

cs.LG 2026-07-01

Evolved Transformers match or beat baseline on time series tasks

by AbdElRahman ElSaid, Damir Pulatov

EVOTS: Evolutionary Transformer Search for Time Series Forecasting

Modular genome and repair mechanism let search discover task-adaptive models competitive on ETT multivariate benchmarks.

Evolutionary neural architecture design for multivariate time-series forecasting remains underexplored, with most approaches relying on fixed Transformer architectures despite substantial variation across tasks and forecasting settings. This paper introduces an evolutionary neural architecture search framework for discovering task-adaptive Transformer-like models for time-series forecasting (EVOTS). Architectures are encoded using a modular genome representation that enables flexible composition of attention, feed-forward, and projection components, while a repair mechanism enforces structural validity throughout the evolutionary process. This formulation allows effective exploration of a diverse architecture space without relying on hand-crafted design rules. The proposed approach is evaluated on four benchmark datasets from the ETT family (ETTh1, ETTh2, ETTm1, and ETTm2) under multiple forecasting settings, including univariate-to-univariate, multivariate-to-univariate, and multivariate-to-multivariate prediction, with horizons of 96, 192, 336, and 720. In the multivariate-to-multivariate setting, the evolved architectures achieve competitive and, in several cases, improved mean squared error relative to a strong Transformer-based baseline. Additional analyses examine performance differences across forecasting settings and report wall-clock training time to provide a coarse indication of computational cost. Overall, the results demonstrate that evolutionary search can effectively discover flexible and high-performing Transformer-like architectures for multivariate time-series forecasting within practical runtime constraints.

cs.NE 2026-07-01

GP symbolic regression results independent of population start method

by Lukas Kammerer, Gabriel Kronberger +4 more

Evaluation of Population Initialization Methods for Genetic Programming-based Symbolic Regression

Tests show random and optimized initializations produce equivalent accuracy-complexity fronts after a few generations on synthetic and real

We analyze the effect of optimizing the initial population of genetic programming (GP) for symbolic regression (SR) on the accuracy and complexity of solutions. We compare three well-established random initialization methods as well as initialization with small optimized solutions from exhaustive symbolic regression (ESR) using a GP/SR implementation which is based on the multi-objective evolutionary algorithm NSGA-II. We compare the final Pareto fronts found with each initialization method on twelve synthetic problems of varying complexity and one real-world dataset. We find no significant differences in accuracy or model complexity among the initialization methods. The initial advantage of initialization with ESR disappears after only a few generations. Our results show that, given similar diversity in the initial population, the effect of the initialization method in GP-based symbolic regression on the final Pareto front is negligible.

cs.NE 2026-07-01

D-HTM issues warnings 8 samples before local anomalies

by Pavia Bera, Jennifer Adorno +1 more

Distributed Hierarchical Temporal Memory with Shared Associative Memory for Cross-Entity Preemptive Warning

Shared memory reuses precursor patterns across entities in a common sparse representation while preserving online learning.

Anomaly detection in multivariate time series remains a critical challenge in large-scale distributed systems, where related entities may exhibit transferable precursor behavior prior to anomaly onset. Existing methods typically operate independently on each data stream and therefore remain fundamentally reactive. To address this limitation, we introduce Distributed Hierarchical Temporal Memory (D-HTM), a neuromorphic framework that enables cross-entity preemptive warning through a Shared Associative Memory (SAM). D-HTM combines a Spatial Pooler (SP) that projects observations into a common Sparse Distributed Representation (SDR) space, Temporal Memory (TM) modules that learn entity-specific dynamics online, and a Shared Associative Memory that stores recurring pre-anomaly signatures. By reusing precursor knowledge across related entities, D-HTM can issue warnings prior to local anomaly onset while preserving HTM's online learning capabilities. We evaluate D-HTM on the Server Machine Dataset (SMD), the Soil Moisture Active Passive (SMAP) dataset, the Mars Science Laboratory (MSL) dataset, and a synthetic cascade benchmark designed to isolate precursor transfer. Experimental results demonstrate effective cross-entity warning propagation while maintaining competitive reactive anomaly detection performance. Across the real-world datasets, D-HTM provides an average warning lead time of 8.1 samples prior to anomaly onset. These findings demonstrate that transferable precursor structure can emerge within a shared SDR space and be reused for preemptive warning generation, extending HTM beyond isolated reactive detection toward distributed predictive reasoning.

cs.LG 2026-07-01

Dual-stream networks learn representations under Dale's principle

by Yutaro Yamada, Luca Grillotti +4 more

Diffusing Blame: Task-Dependent Credit Assignment in Biologically Plausible Dual-Stream Networks

Modulo error routing reaches 96.7% on MNIST and 61.7% on CIFAR-10 without backpropagation.

Biological neural circuits obey Dale's principle: each neuron's synapses are uniformly excitatory or inhibitory. Artificial networks that respect this constraint must coordinate separate excitatory and inhibitory populations, fundamentally changing how credit is assigned during learning. Several biologically plausible learning rules avoid backpropagation's weight transport requirement, but it has been difficult to achieve strong performance under Dale's principle beyond MNIST. Error Diffusion (ED) was originally proposed in a dual-stream excitatory/inhibitory architecture, where learning is driven by routing global error signals to all layers without transporting transposed forward weights or relying on random feedback matrices. Whether such a rule can scale under Dale's principle across both supervised classification and reinforcement learning remains unknown. Here, we introduce modulo error routing to extend Error Diffusion beyond binary classification, and show that a dual-stream excitatory/inhibitory architecture trained with this method achieves 96.7% on MNIST and establishes a 61.7% baseline on CIFAR-10, demonstrating that representation learning is possible even when strictly enforcing Dale's principle. For the classification setting, we introduce three domain-specific innovations: layer-specific sigmoid widths, batch-centered class error signals, and asymmetric initialization, and ablation analysis reveals that their relative importance reverses between MNIST and CIFAR-10, exposing task-dependent credit-assignment bottlenecks invisible to single-benchmark evaluation. In reinforcement learning, we integrate ED with Proximal Policy Optimization (PPO) and evaluate it on continuous-control tasks in Google Brax and on Craftax, an open-ended exploration task. We show that ED-PPO achieves competitive performance relative to Direct Feedback Alignment, a backpropagation-free baseline.

cs.NE 2026-07-01

MMAO outperforms baselines on CEC2017 and TSPLIB benchmarks

by Jinliang Xu, Liping Ma

A Large-Scale Empirical Evaluation of MMAO Under Fair-Budget Continuous and Discrete Benchmarks

Large-scale controlled-budget tests show the closed-loop allocator exceeds PSO-lite, ES-lite and 2-opt on standard continuous and routing pr

This paper evaluates the Metabolic Multi-Agent Optimizer (MMAO) under a stricter empirical protocol rather than reintroducing the framework itself. The study asks whether MMAO's closed-loop resource-allocation principle remains credible under broader, more standard, and more explicitly budget-controlled continuous and discrete benchmarks. The main completed matrix covers eight CEC2017 functions at 10D and 30D with 20 seeds each, and five TSPLIB instances with 20 seeds each, together with stronger reproducible baselines including PSO-lite, ES-lite, and an iterated-greedy 2-opt route baseline. We further add trajectory-level diagnostics for communal budget, success rate, role evolution, and population turnover, plus an auxiliary OR-Library multiple-knapsack slice to extend the discrete evidence beyond routing. Under this protocol, MMAO clearly outperforms the external baseline set on the continuous side and on the TSPLIB side, while the ablation variants remain much closer to the full method than the external baselines are. We therefore position MMAO as a benchmark-backed cross-domain adaptive framework whose most clearly validated value is endogenous resource redistribution under evidence pressure, while also noting that the strongest remaining gap is not basic workability but sharper mechanism isolation and broader competition-grade comparison.

cs.LG 2026-07-01

Black-box measure bounds neural net error under input noise

by Mark Levene, Martyn Harris

Robustness of neural networks to random noise perturbations of their inputs

A simple statistic gives high-probability upper limits on mean squared error for perturbed inputs, tested on real datasets.

We investigate the problem of the robustness of a trained neural network to the perturbation of its input values. More specifically, we examine the interplay between the accuracy of the network, as measured by the mean squared error, and robustness. Accordingly, we present a robustness measure, which, with high probability, suggests an upper bound on the mean squared error of the network, with respect to an input data set, for a given perturbation of the input values of the network. The measure we propose is both simple and efficient to compute, treating the neural network as a black box. We provide experimental results on several real-world data sets showing the efficacy of the proposed method. We also introduce the concept of robustness curves, which allows us to further analyse robustness within and between data sets.

cs.LG 2026-06-30

PGDS labels decision variables as drivers or blockers in 10-objective problems

by Cláudio Lúcio Do Val Lopes, Flávio Vinícius Cruzeiro Martins +1 more

Partition-Guided Distance Saliency: Bridging Decision and Objective Spaces in Many-Objective Optimization

Surrogate distance mapping plus automatic partitioning yields actionable explanations where visualization fails.

Explainability in Many-Objective Optimization (MaO) is currently hindered by the escalating complexity of the Pareto front, which renders the relationship between high-dimensional decision variables and objective outcomes increasingly opaque. As the number of objectives exceeds the limits of traditional visualization, decision-makers encounter a "cognitive drought" in identifying relevant trade-offs or specifying target regions without a priori knowledge. To bridge this interpretability gap, we introduce the Partition-Guided Distance Saliency (PGDS) framework, a novel XAI approach designed for continuous optimization landscapes. Our framework automates the explanation process through a three-stage pipeline that prioritizes geometric intuition over abstract rules. First, we employ a surrogate model that learns how geometric distances in the decision space map to proximity in the objective space. Second, to address the difficulty of manual target selection in high dimensions, the framework automatically partitions the objective landscape into distinct regions and identifies local "Dominating Points" to serve as automated targets for improvement. Third, we quantify how sensitive a solution's position is to each decision variable by measuring the distance shifts induced by perturbations to each variable. This allows PGDS to categorize features as either "Drivers," which facilitate convergence toward preferred regions, or "Blockers," which represent geometric constraints hindering further progress. Validation on 10-objective benchmarks and a physics-informed engineering problem (Welded Beam) demonstrates that PGDS provides differentiated, actionable insights that traditional visualization and rule-based XAI methods fail to provide.

cond-mat.stat-mech 2026-06-30

Genetic algorithm equals clipped gradient descent with Hessian-controlled noise

by Stephen Whitelam

Why can genetic algorithms work in high-dimensional search spaces?

Transverse fluctuations depend on effective rank of the loss Hessian, not parameter count, allowing scaling in high dimensions.

We show that the effective dynamics of the elitist $(1+M)$ genetic algorithm is, in the limit of small mutations, clipped gradient descent on the loss in the presence of anisotropic Gaussian white noise. In expectation, therefore, a simple mutation-selection genetic algorithm follows the gradient of the loss, without explicit calculation of gradients and without averaging over loss evaluations. The genetic algorithm is slower than gradient descent because of the noise that acts in directions transverse to the gradient. However, this slowdown is controlled not by the number of parameters of the search space but by the effective rank of the Hessian of the loss function. For the concentrated Hessian spectra observed in neural-network loss functions the effective rank can be far smaller than the number of parameters, which may explain why genetic algorithms can scale to large search spaces.
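
As a rough illustration of the mutation-selection dynamics described in this abstract, the sketch below runs an elitist (1+M) genetic algorithm on a toy quadratic loss. The loss, its coefficients, the mutation scale `sigma`, and all function names are assumptions chosen for illustration; none of them come from the paper.

```python
import random


def loss(x):
    # Toy quadratic loss with an anisotropic Hessian (illustrative assumption).
    return sum(c * xi * xi for c, xi in zip((1.0, 0.1, 0.01), x))


def one_plus_m_step(parent, m, sigma, rng):
    """One generation: M Gaussian mutants compete with the parent (elitism)."""
    best, best_loss = parent, loss(parent)
    for _ in range(m):
        child = [xi + rng.gauss(0.0, sigma) for xi in parent]
        child_loss = loss(child)
        if child_loss < best_loss:
            best, best_loss = child, child_loss
    return best


def run(generations=200, m=8, sigma=0.05, seed=0):
    """Run the (1+M) loop and record the loss after every generation."""
    rng = random.Random(seed)
    x = [1.0, 1.0, 1.0]
    history = [loss(x)]
    for _ in range(generations):
        x = one_plus_m_step(x, m, sigma, rng)
        history.append(loss(x))
    return x, history
```

Because the parent is replaced only when a mutant improves on it, the recorded loss never increases, and no gradient is computed at any point.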

cs.CG 2026-06-30

Box decompositions compute integral R2 in O(n log n) for 2-3 objectives

by Michael T. M. Emmerich

Computing the Integral R2 Indicator by Perspective Mapping and Box Decomposition

Perspective mapping equates the indicator to weighted hypervolume differences, letting existing algorithms apply after a density substitutio

The continuous integral R2 indicator is a Pareto-compliant refinement of the classical finite-weight-vector R2 indicator, used in performance assessment, bounded archiving for a-posteriori multi-objective optimization, and skyline selection in databases. This work introduces a bidirectional perspective mapping between continuous integral R2 computation and integration over unions of anchored axis-aligned boxes. After translating the ideal point of a minimization problem to the origin, approximation points become strictly positive loss vectors, and the subgraph of the lower weighted Tchebycheff envelope over the weight simplex maps to the complement of an anchored-box union in reciprocal objective space. The Jacobian gives an absolute R2 formula as a weighted complement volume with density $(x_1+\cdots+x_N)^{-(N+1)}$, while differences of R2 values become finite weighted hypervolume differences. Hence, hypervolume algorithms that emit box decompositions can be reused by replacing ordinary box volumes with closed-form weighted box integrals. For $N$ objectives, this gives an output-sensitive overhead $O(2^N M)$ for an $M$-box decomposition, or $O(M)$ for fixed $N$. Using existing box-decomposition approaches, the integral R2 can be computed in $O(n \log n)$ for $N=2,3$, in $O(n^2)$ for $N=4$, and in $O\left(n^{\lfloor (N-1)/2\rfloor+1}\right)$ for $N\geq4$, with $n$ denoting the size of the approximation set. On the lower-bound side, exact value computation has an $\Omega(n\log n)$ lower bound in the algebraic decision-tree model already in two objectives; this bound lifts to every fixed $N\geq2$, and exact computation is $\#P$-hard when $N$ is part of the input. Together, the proposed perspective mapping provides a powerful tool for transferring algorithmic and structural results between anchored-box union and hypervolume theory and integral R2 computation.
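
For orientation, the classical finite-weight-vector R2 indicator that the integral version refines can be sketched as an average of weighted Tchebycheff minima. The function name, the sample weight vectors, and the use of the origin as ideal point are illustrative assumptions. The integral indicator from the abstract replaces the finite average with an integral over the weight simplex and is not implemented here.

```python
def r2_indicator(points, weights, ideal):
    """Classical R2 for minimization: mean over weight vectors of the best
    (smallest) weighted Tchebycheff distance to the ideal point."""
    total = 0.0
    for w in weights:
        total += min(
            max(wi * abs(ai - zi) for wi, ai, zi in zip(w, a, ideal))
            for a in points
        )
    return total / len(weights)


# Hypothetical two-objective approximation sets and weight vectors.
weights = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
ideal = (0.0, 0.0)
base = [(1.0, 2.0), (2.0, 1.0)]
improved = base + [(1.0, 1.0)]
```

Lower values are better under this convention, so adding a point that dominates part of the set can only keep or reduce the indicator.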

cs.NE 2026-06-30

Metabolic loop drives adaptive metaheuristic search across domains

by Jinliang Xu, Liping Ma

Minimal MMAO: A Resource-Closed-Loop Framework for Adaptive Metaheuristic Search

Private energy and communal budget replace external schedules for intensity and balance control.

This paper presents the Metabolic Multi-Agent Optimizer (MMAO) as an adaptive metaheuristic built around endogenous resource circulation. The central premise is that search intensity, exploration-exploitation balance, and lifecycle turnover should be induced by a shared metabolic controller rather than by separately attached schedules. We formulate MMAO through bounded private energy, a communal budget, normalized reward, continuous role adaptation, and resource-financed branching and pruning. The method is then instantiated in both continuous and discrete domains and evaluated on a matched small-scale suite including Sphere, Rastrigin, a synthetic Euclidean TSP, and two TSPLIB instances. The results show a consistent pattern: the same metabolic loop remains workable across domains, the discrete realization remains relatively stable under a compact design, and continuous refinement quality is the main cost of keeping the method lean. Taken together, these findings position MMAO as a coherent framework for adaptive heuristic design rather than a loose collection of operators.

cs.AI 2026-06-30

Slow self-credit builds durable behavioral residue after buffer removal

by Haoliang Han

From Detecting Agency to Doing Work: Self-Caused Credit Builds a Durable Behavioral Self in a Minimal Spiking Agent

Agency-gated updates on own actions create residue that persists without episodic memory, while agency detection alone produces none.

How does an agent that can tell self from world come to be durably shaped by that distinction? Recent work shows that a predictive system can detect its own agency (Ye, 2026), but detecting agency does not explain durable, self-shaped behavior. We show that agency-gated slow credit -- a conjunctive term Own*Agency*Salience driving a slow parameter update -- produces post-unload behavioral residue: on a spiking substrate (Nengo LIF/PES), a learned self-preserving choice survives episodic buffer removal (retained fraction 0.96, N=50) and collapses when the slow decoders are reset or the agency gate is removed. Reproducing the agency comparator and toggling only the slow-credit channel, we find a clean dissociation: at matched agency gain, durable behavior develops only when self-credit performs slow work (post-unload self-preservation 1.00 vs 0.00). The same dissociation holds in 24-dimensional partially-observed control (0.74 vs 0.00), and a plastic-work analysis shows that basin deformation equals net self-credit work. Across eight sequentially-learned tasks under exogenous interference, the multiplicative veto also prevents forgetting: it retains old tasks (final post-unload accuracy 0.88, forgetting 0.13) where additive pooling collapses to chance-level recall, the no-agency ablation falls below chance, and episodic/replay baselines stay near chance after unload -- all with no replay buffer and no task-boundary-dependent protection mechanism (N=50). We formalize the durable residue as an operational behavioral self and argue that self-caused credit doing slow work is a necessary building block for agents that develop a self. No claim of consciousness is made.

cs.NE 2026-06-30

STABLE co-evolves configs and components using semantic models

by Zhiyao Zhang, Shenghao Wu +2 more

Semantics-Aware Bilevel Co-Evolution: Towards Automated Multicomponent Algorithm Design

Bilevel evolution with semantics guidance yields better multicomponent algorithm designs than baselines.

LLM-assisted evolutionary search (LES) has emerged as a promising paradigm for automated algorithm design. However, existing methods usually suffer from two inherent limitations when facing the automated design of real-world complex algorithms that usually consist of multiple components. The first limitation is that they either focus on modifying entire algorithms, making it difficult to reuse high-quality components, or concentrate on component refinement within a limited set of predefined multicomponent configurations. The second limitation is the insufficient explicit modeling and exploitation of algorithm semantics. These limitations severely degrade search efficiency and hinder effective exploration of complex design spaces. Therefore, this paper proposes STABLE (Semantics-Aware Bilevel Co-Evolution), an LES method purpose-built for automated multicomponent algorithm design that introduces structural algorithm formulation and semantics-driven evolution. In STABLE, complex algorithms are organized into hierarchical and modular architectures rooted in domain knowledge, aligning the search space with their intrinsic compositional traits. Based on this structured algorithm formulation, STABLE simultaneously optimizes high-level multicomponent configurations and low-level functional components, enabling coordinated cross-level updates while maintaining suitable granularities for design space exploration. At each level, STABLE establishes a multi-faceted semantic model to assist LLMs in capturing structural correlations, functional compatibilities, and inherent rationalities among algorithm components. This semantic model serves as the core guidance for evolutionary search, enabling principled algorithm generation and algorithm evaluation. Extensive experiments demonstrate that STABLE outperforms both human-designed baselines and those from advanced LES methods.

cs.NE 2026-06-30

Evolution strategy finds much smaller CNNs for steering prediction

by Devson Butani, Ryan Kaddis +1 more

Evolutionary Hyperparameter Optimization to Find Lightweight CNN Models for Autonomous Steering

Search on small driving-image set yields compact models whose accuracy stays close to the baseline.

This research investigates the optimization of Convolutional and Dense Neural Networks (CNNs and DNNs) for autonomous steering using the (N+M) Evolution Strategy (ES) with the 1/5th success rule. The primary objective is to develop a lightweight CNN-based model capable of real-time steering angle prediction, mimicking human driving behavior on predefined paths. The ES algorithm automates hyperparameter tuning, dynamically adjusting parameters such as filter sizes and layer configurations. Data collection encompasses driving scenarios recorded via the LTU ACTor autonomous driving platform, including variations in path direction and driving style. The very small dataset consists of timestamped images labeled with steering angles and pre-processed to focus on relevant visual information. Initial experiments involve training a baseline CNN model, which is then refined using ES to significantly reduce the size of the model while maintaining competitive predictive accuracy. The results highlight the viability of lightweight neural network architectures for real-time autonomous systems, striking a balance between computational efficiency and performance. This study not only advances research initiatives on the use of evolutionary algorithms for autonomous driving applications but also lays the foundation for the deployment of cost-effective and scalable solutions in self-driving technology.
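
The search loop named in this abstract, an (N+M) Evolution Strategy with the 1/5th success rule, might look roughly like the sketch below. It minimizes a toy sphere function instead of training CNNs; the objective, the population sizes, the adaptation factor 0.85, and every name are assumptions for illustration only.

```python
import random


def sphere(x):
    # Stand-in objective; the paper's objective involves trained CNNs.
    return sum(v * v for v in x)


def es_plus(n_parents=3, n_offspring=9, dim=4, generations=100, seed=1):
    """(N+M)-ES: the best N of parents plus offspring survive each generation."""
    rng = random.Random(seed)
    sigma = 1.0
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(n_parents)]
    best_history = [min(sphere(p) for p in pop)]
    for _ in range(generations):
        offspring, successes = [], 0
        for _ in range(n_offspring):
            parent = rng.choice(pop)
            child = [v + rng.gauss(0.0, sigma) for v in parent]
            successes += sphere(child) < sphere(parent)
            offspring.append(child)
        # 1/5th success rule: widen the step when more than one fifth of the
        # mutations succeed, shrink it when fewer do.
        rate = successes / n_offspring
        if rate > 0.2:
            sigma /= 0.85
        elif rate < 0.2:
            sigma *= 0.85
        pop = sorted(pop + offspring, key=sphere)[:n_parents]
        best_history.append(sphere(pop[0]))
    return pop[0], sigma, best_history
```

Plus-selection keeps the best individual found so far, so the best objective value cannot get worse between generations.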

q-bio.NC 2026-06-29

Geometric stability of neural codes tracks behavior apart from drift

by Prashant C. Raju

Geometric Stability of Neural Population Codes: Regional Variation, Behavioral Relevance, and Circuit Dependence

Split-half distance consistency predicts trial-by-trial neural-behavioral coupling while centroid measures show none

Current models of representational reliability in neural populations focus on temporal stability: whether population centroids are preserved across sessions and days. This framing leaves a fundamental question unanswered: how reliably does the pairwise distance structure among stimuli reproduce across independent observations within a session? We argue that this property, geometric stability, constitutes an independent axis of representational analysis that existing frameworks do not capture. We formalize geometric stability as the Spearman rank correlation between split-half representational dissimilarity matrices (Shesha) and show that it is empirically dissociable from both temporal stability and decoding accuracy. Across 229 area-session observations spanning 68 brain regions in a visual discrimination task (Steinmetz et al. 2019), geometric stability predicts trial-by-trial neural-behavioral coupling ($\rho = 0.18$, $p = 0.005$) while centroid drift does not ($\rho = 0.002$, $p = 0.976$). The regional hierarchy, with striatum most stable ($\bar{S} = 0.44$) and hippocampus least ($\bar{S} = 0.19$), runs roughly opposite to the temporal stability hierarchy. Directionally consistent olfactory data (Bolding & Franks 2018) motivate an attractor network model in which recurrent excitatory coupling amplifies split-half RDM consistency by completing stimulus patterns from sparse feedforward input ($\rho = +0.64$, $p = 0.010$), providing a circuit-level account of how geometric stability emerges. These results establish geometric stability as a functionally relevant, circuit-dependent property of neural population codes, orthogonal to temporal drift measures and complementary to recent accounts of how recurrent connectivity balances representational stability with sequential dynamics in hippocampal circuits.
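
The definition given in this abstract, a Spearman rank correlation between split-half representational dissimilarity matrices, can be sketched as follows. Euclidean distances between per-stimulus mean responses, a single random split of trials, and all function names are assumptions made for this sketch; the paper's exact procedure may differ.

```python
import math
import random


def centroid(trials):
    """Mean response vector over a list of trial vectors."""
    n = len(trials)
    return [sum(t[d] for t in trials) / n for d in range(len(trials[0]))]


def rdm_upper(centroids):
    """Upper triangle of the pairwise Euclidean dissimilarity matrix."""
    k = len(centroids)
    return [math.dist(centroids[i], centroids[j])
            for i in range(k) for j in range(i + 1, k)]


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return out


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def spearman(a, b):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def split_half_stability(responses, rng):
    """responses maps each stimulus to a list of trial vectors."""
    half_a, half_b = [], []
    for stim in sorted(responses):
        trials = list(responses[stim])
        rng.shuffle(trials)
        mid = len(trials) // 2
        half_a.append(centroid(trials[:mid]))
        half_b.append(centroid(trials[mid:]))
    return spearman(rdm_upper(half_a), rdm_upper(half_b))
```

In practice the score would typically be averaged over many random splits; a single split is used here only to keep the sketch short.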

cs.NE 2026-06-29

Opposing waves train deep nets with local Hebbian rules

by Andreas Knoblauch

Supervised Hebbian learning in Deep Counterstream Associative Networks

Simultaneous activity waves from input and output meet in hidden layers to link patterns bidirectionally, matching backprop accuracy on bina

Modern machine learning applications employ deep neural networks trained with the error backpropagation algorithm. Although this algorithm is very effective, it lacks biological realism. For example, backpropagation requires symmetric connectivity and a separate neural processing channel for error signals. Prior works have therefore proposed a number of more realistic alternatives to error backpropagation. However, most of them still rely on demanding assumptions that may not be fulfilled in the real brain: they often still require either symmetric connectivity or two separate processing channels, and often also require special mathematical operations like subtractions or function inversions. Here I propose supervised counterstream learning in deep associative networks as a simpler approach that requires only recognition of errors during training, and then backpropagates correcting target activity through the same activity channel as used for forward propagation. For this, two activity waves are initiated at the same time in the input and output layers and then travel in opposite directions to meet in one of the hidden layers. By employing simple local Hebbian-type learning rules, the corresponding activity pattern sequences get linked bidirectionally, thereby decreasing error rates over time. Despite its simplicity and an incomplete hyperparameter optimization, a high test accuracy is achieved on the (binarized) MNIST data set that is comparable to more demanding architectures.

cs.AI 2026-06-29

Agents invent languages cutting LLM reasoning tokens 3-6x

by Zhengqi Pei, Qingming Huang +1 more

When LLMs Develop Languages: Symbolic Communication for Efficient Multi-Agent Reasoning

CLSR lets models evolve compact symbolic protocols that match CoT accuracy at far lower cost.

Chain-of-Thought (CoT) improves large language models (LLMs) on difficult reasoning tasks, but it often incurs long natural-language rationales that are poorly aligned with efficient machine reasoning. We propose Communicative Language Symbolism Routing (CLSR), a test-time framework in which multiple LLM agents autonomously invent, evolve, and share compact Language Symbolism Frameworks (LSFs), while a latent-free router adaptively selects and composes these languages per query to optimize the accuracy-token trade-off. Unlike prompt optimization that refines surface instructions, CLSR treats each LSF as a reusable symbolic protocol with compact symbols, usage rules, and a message-passing contract, and improves it through an evolutionary loop driven by correctness and token cost. At inference time, the router may invoke a single low-cost LSF call, ensemble multiple LSFs, or execute a multi-round LSF composition protocol on harder queries. Across challenging benchmarks, CLSR reduces latency-oriented generated token completion by $3\sim 6\times$ compared to standard CoT while maintaining accuracy. We further derive an information-theoretic lower bound on token cost under arbitrary symbolism and show that, under an interpreter-realizability premise, multi-round LSF protocols conditionally subsume program-execution pipelines. Code is publicly available (https://github.com/pzqpzq/LSF_MDia).

cs.CL 2026-06-29

KG traces lift travel LLM accuracy from 22% to 82%

by Vignesh Ram Nithin Kappagantula, Shayan Hassantabar +2 more

Travel-Oriented Reasoning Large Language Model via Domain-Specific Knowledge Graphs

Bottom-up walks of an expert travel graph create auditable multi-hop reasoning traces that embed domain rules into the model.

Large language models (LLMs) demonstrate broad reasoning abilities but struggle with accuracy and reliability in specialized domains such as travel, where reasoning depends on precise definitions, rules, and expert-defined conceptual frameworks, and where confident but unfounded outputs arise from a reasoning failure in which the model has not internalized the underlying domain graph rather than from missing domain knowledge alone. We propose a modular pipeline for building a travel-domain reasoning LLM grounded in an expert-designed knowledge graph (KG). Our pipeline integrates a travel KG that encodes domain entities and their relationships, a bottom-up construction procedure that walks the KG to produce multi-hop question answer (QA) pairs, a supervised fine-tuning stage that embeds the domain knowledge into a reasoning-capable LLM using the generated QA pairs as auditable reasoning traces, and a travel-domain benchmark dataset that measures the fine-tuned model's accuracy and calibration. We evaluate our approach using Qwen3-4B with LoRA adaptation. Our reasoning model achieves an $82.4\%$ exact match on the benchmark. This performance significantly outperforms the pretrained Qwen3-4B baseline at $22.4\%$. A calibration analysis decomposes the residual $17.57\%$ of errors into two distinct failure modes: an over-confident multi-label decoder that predicts both correct answers plus one spurious option on most dual-answer mistakes, and a smaller reasoning failure on single-answer questions where the supporting facts are present in the KG but the model fails to reconstruct the correct multi-hop path. This split confirms that explicit KG-grounded reasoning substantially improves the accuracy and uncertainty interpretation of LLMs in specialized domains, and isolates per-option calibration and trace-length-aware decoding as the next axes of improvement.

cs.NE 2026-06-29

Complex neuron merges signal strength with timed spikes

by Reza Ahmadvand, Sarah Safura Sharif +1 more

Unified Complex-valued Neural Network: A Magnitude-Phase Computational Model for Event-Driven Neuromorphic Learning

Magnitude handles continuous values while phase drives events, allowing one model for accurate neuromorphic spatiotemporal tasks.

Artificial neural networks (ANNs) provide accurate continuous-valued representation, whereas spiking neural networks (SNNs) offer event-driven temporal processing, yet both paradigms face limitations when value encoding and timing dynamics must be learned within a single computational structure. This paper introduces a network based on the Unified Complex-valued Neuron (UCN), a new neural computational model that integrates continuous activation and phase-driven event generation through an asymmetric complex-valued state. In the UCN, magnitude encodes signal strength while phase governs intrinsic temporal evolution and valued spike emission. A foundational training framework combining backpropagation (BP) and backpropagation through time (BPTT) is first developed to optimize magnitude and phase pathways in a unified way. To reduce computational complexity, an event-driven adaptive phase learning (EAPL) rule is then introduced as a more efficient alternative. The proposed model is evaluated through object tracking and Lorenz attractor learning. Results demonstrate that the UCN-based Network (UCNN) provides accurate, stable, and interpretable spatiotemporal learning while preserving sparse event-driven computation for neuromorphic and edge-AI applications.

cs.NE 2026-06-29

Neuromorphic chip beats CPU on energy for most graph searches

by Oskar von Seeler, Elena C. Offenberg +5 more

Road to scalability for efficient graph search on massively parallel neuromorphic hardware

NEURO-MAPP on SpiNNaker 2 uses less energy per shortest-path query than a modern CPU in almost all tested graphs while scaling well in speed

Efficient computation of shortest paths in weighted graphs is a fundamental problem with many applications. Neuromorphic hardware platforms promise massively parallel, efficient computation, changing parallelism tradeoffs. In this work, we introduce NEURO-MAPP (Neuromorphic-based Min-Add Parallel Propagation), a distributed shortest path algorithm designed to use the local computation and network communication available in neuromorphic systems. We provide an optimized implementation of the algorithm on the SpiNNaker 2 platform and evaluate its performance on a selection of synthetic and real-world graphs. These results are compared to Dijkstra's algorithm on a modern CPU. We find that the NEURO-MAPP implementation scales favorably in terms of runtime for many graph types while consuming less energy per shortest-path query than the CPU implementation in almost all cases. These findings highlight the potential of neuromorphic hardware featuring sparse, spike-based communication as a scalable and energy-efficient platform for computation in graph search and related tasks.
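
The "min-add" idea in the algorithm's name can be illustrated with a generic synchronous min-plus relaxation, in which every node repeatedly takes the minimum over its incoming edges of (sender distance + edge weight). This is a Bellman-Ford-style sketch under that assumption, not the NEURO-MAPP implementation; the graph encoding and the function name are hypothetical, and edge weights are assumed non-negative.

```python
import math


def min_add_propagation(graph, source):
    """graph maps each node to a list of (neighbor, weight) pairs.
    Every node must appear as a key, even if it has no outgoing edges."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0.0
    # At most |V| - 1 synchronous rounds are needed for shortest paths.
    for _ in range(len(graph) - 1):
        updated = dict(dist)
        for u, edges in graph.items():
            for v, w in edges:
                # "Add" the edge weight, then keep the "min" at the receiver.
                if dist[u] + w < updated[v]:
                    updated[v] = dist[u] + w
        if updated == dist:
            break  # No distance changed in this round: converged.
        dist = updated
    return dist


# Hypothetical four-node example graph.
example = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 2.0), ("d", 6.0)],
    "c": [("d", 1.0)],
    "d": [],
}
```

Each round reads only the previous round's distances, so all edge updates within a round are independent of one another and could run in parallel.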

cs.LG 2026-06-29

Closed-form gradient steers networks toward flat minima

by Yuto Omae, Kazuki Sakai +4 more

Closed-Form Steepest Descent Direction toward Flat Minima: Reducing Upper Bounds on the Loss Hessian Eigenspectrum in Neural Networks

Deriving the gradient of the Wolkowicz-Styan upper bound supplies parameter updates that narrow the loss Hessian eigenvalue spectrum in thre

The flatness hypothesis suggests that flatness of the loss landscape, as measured by the eigenvalues of the loss Hessian, correlates with better neural network generalization. While various algorithms reduce these eigenvalues, most focus on procedural design, leaving it unclear how data distributions and NN parameters structurally determine directions toward flat minima. Characterizing these directions analytically is generally intractable. To overcome this mathematical difficulty, recent studies derived the Wolkowicz-Styan (WS) upper bound on the maximum eigenvalue of the cross-entropy loss Hessian in three-layer NNs. Although this upper bound is differentiable, its gradient was not derived. Therefore, we analytically derive the gradient of the WS upper bound to characterize directions leading to flat minima. Based on this, we propose Hessian Spectral Range (HSR) Regularization, which updates parameters along the steepest descent direction of the WS bound. Experiments demonstrate that HSR Regularization narrows the Hessian eigenvalue spectrum, avoids sharp minima and saddle points, and promotes convergence to flat minima. Although the applicability of this method is currently limited to cross-entropy loss and three-layer architectures, to the best of the authors' knowledge, this is the first study to report a closed-form gradient that promotes convergence to flat minima without numerical approximations. Therefore, the theoretical analysis of this gradient is expected to contribute to the further development of NNs.
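
For readers unfamiliar with the Wolkowicz-Styan bound, its generic form for a real symmetric n x n matrix A is lambda_max <= m + s * sqrt(n - 1), with m = tr(A)/n and s^2 = tr(A^2)/n - m^2. The sketch below evaluates that bound on small hand-written matrices. It does not reproduce the paper's specialization to the cross-entropy Hessian or its gradient, and the function name is an assumption.

```python
import math


def ws_upper_bound(a):
    """Wolkowicz-Styan upper bound on the largest eigenvalue of a real
    symmetric matrix given as a list of rows."""
    n = len(a)
    trace = sum(a[i][i] for i in range(n))
    # tr(A^2) = sum over i, j of a[i][j] * a[j][i].
    trace_sq = sum(a[i][j] * a[j][i] for i in range(n) for j in range(n))
    m = trace / n
    # Clamp at zero to guard against tiny negative values from rounding.
    s = math.sqrt(max(trace_sq / n - m * m, 0.0))
    return m + s * math.sqrt(n - 1)
```

The bound needs only the traces of A and A^2, which is why it can be written in closed form without an eigendecomposition. For the 2 x 2 matrix [[2, 1], [1, 2]], whose eigenvalues are 1 and 3, it evaluates to 3 and is tight.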

cs.NE 2026-06-29

Variance dynamics model sets bat algorithm parameter bounds

by Xin-She Yang, Mehmet Karamanoglu

Analysis of Parameter Settings for the Bat Algorithm Using Variance Evolution

Theoretical ranges from population variance evolution match experiments and clarify exploration-to-exploitation shifts.

Parameter settings in evolutionary algorithms and metaheuristics are important because such parameter values can influence the performance of algorithms under evaluation. For a given algorithm, there are many different numerical experiments to show that the algorithm can work well in practice; however, in most cases there is no theoretical analysis of parameter settings. In this work, we show that theoretical analysis using the theory of dynamical systems and evolution of population variance can give some good results in terms of parameter ranges for the bat algorithm. We also show that results from numerical experiments are consistent with theoretical bounds. Such analyses can provide good insights from different perspectives about the algorithmic characteristics such as variance evolution, transition between exploration and exploitation as well as convergence behaviour.

cs.NE 2026-06-29

Spiking Q-network cuts DBS charge 80% while suppressing oscillations 45%

by Binh Nguyen, Colleen Josephson +3 more

Neuromorphic Energy-Aware Learning for Adaptive Deep Brain Stimulation

Energy-aware RL trains a controller that balances symptom relief against stimulation power and runs at 0.52 mW on neuromorphic hardware

Neuromorphic and edge computing research has focused on reducing the inference cost of neural network controllers, yet in physical closed-loop systems the actuator can rival or exceed an efficient controller in energy. An efficient controller is therefore necessary but not sufficient, because the actuator becomes the cost worth reducing once inference no longer dominates it. Here, we introduce energy-aware learning, an approach that incorporates actuator energy directly into the reinforcement learning reward, and demonstrate it in closed-loop deep brain stimulation (DBS) for Parkinson's disease. A deep spiking Q-network, trained in a biophysical cortico-basal ganglia-thalamic circuit model, learns to suppress pathological alpha-beta oscillations by 45.2% while reducing stimulation charge by 80.0% relative to continuous DBS. Sparsity-constrained knowledge distillation compresses the policy onto the SynSense XyloAudio 3 neuromorphic processor at 0.52 mW inference power, yielding 28.1x lower energy per inference than an equivalent artificial neural network on conventional edge hardware. By co-optimizing stimulation energy and inference efficiency, the framework addresses both major power demands in implantable neuromodulation.

math.OC 2026-06-29

Weighted sums miss non-supported points on concave Pareto fronts

by Olaf Frommann

Comparing Scalar Objective Functions for Multi-Criteria Engineering Optimization

Four scalar objective functions tested on analytic convex and concave fronts reveal differences in reachable engineering compromises.

Scalar objective functions are required when a multi-criteria optimization problem must yield a single preferred design rather than only a Pareto set. The choice of scalarization influences which compromise is selected, how preference parameters are interpreted, and whether non-supported Pareto regions can be reached. This paper compares four formulations for normalized bi-criteria minimization: weighted sums, achievement scalarizing functions, desirability functions, and a fuzzy-logic-based formulation. Two analytically defined Pareto fronts, one convex and one concave, isolate the effect of the objective formulation from numerical optimizer behavior. The comparison focuses on reachable Pareto regions, parameter-induced selection density, compensation between criteria, sensitivity, and interpretability. Results show that weighted sums are simple but structurally limited on concave fronts, while achievement, desirability, and fuzzy formulations reach interior non-supported regions through different mechanisms. Desirability functions introduce nonlinear single-criterion preference mappings, whereas fuzzy rules express nonseparable and reference-dependent engineering preferences.
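
The structural limitation of weighted sums on concave fronts can be checked numerically. The sketch below samples a quarter-circle front (an assumed stand-in, not one of the paper's two analytic fronts) and compares a weighted sum with a weighted Chebyshev scalarization, used here as a simple proxy for achievement-type functions. All names and the sampling grid are illustrative.

```python
import math


def concave_front(samples=101):
    """Points on f1^2 + f2^2 = 1 with f1, f2 >= 0; for minimization this
    front is concave, so its interior points are non-supported."""
    points = []
    for i in range(samples):
        f1 = i / (samples - 1)
        points.append((f1, math.sqrt(max(0.0, 1.0 - f1 * f1))))
    return points


def weighted_sum(point, w):
    return w * point[0] + (1.0 - w) * point[1]


def chebyshev(point, w):
    # Weighted Chebyshev distance to the ideal point (0, 0).
    return max(w * point[0], (1.0 - w) * point[1])


def selected_points(front, scalarizer, weights):
    """Front point minimizing the scalarizer, one per weight."""
    return [min(front, key=lambda p: scalarizer(p, w)) for w in weights]
```

On this front the weighted sum is a concave function of f1, so its minimum always falls on one of the two extreme points, whatever the weight. With moderate weights the Chebyshev form instead selects the interior point where the two weighted criteria balance.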

cs.NE 2026-06-29

Metabolic loop enables endogenous adaptation in multi-agent optimizer

by Jinliang Xu, Liping Ma

MMAO: A Metabolic Multi-Agent Optimizer with Endogenous Resource Allocation for Continuous and Discrete Optimization

Fitness gains regulate agent sensing, roles and search without manual parameters for continuous and discrete tasks.

Traditional meta-heuristics often rely on fixed population sizes, manually chosen search scales, and externally attached parameter-control modules. This paper presents the Metabolic Multi-Agent Optimizer (MMAO), a cross-domain optimization framework in which adaptation is derived endogenously from a private-public metabolic resource loop. Each agent carries internal energy, a continuous role state, motion or structural memory, and local search history, while the population shares a communal resource pool. Fitness improvements are converted into normalized metabolic gains through a robust progress scale and a recent success statistic; the same closed loop then regulates sensing intensity, search amplitude, role drift, branching, pruning, respawning, and elite reinvestment. In the continuous setting, MMAO uses energy-regulated symmetric zero-order probing and role-interpolated motion. In the discrete setting, the same control law is instantiated through structural sensing, local route improvement, guided perturbation, and energy-weighted edge reuse. The paper combines an implementation-faithful formulation with a reproducible experimental study on a CEC2017 subset (10D/30D, 20 seeds) and five TSPLIB instances (100 discrete runs in total). The current evidence supports MMAO primarily as a parameter-light, self-calibrating optimization framework whose main validated originality lies in metabolically endogenous resource allocation across heterogeneous search behaviors, rather than as a universally superior optimizer.

cs.AR 2026-06-29

Analog KANs with pruning cut area by 55% and power by 50%

by Paula Carolina Lozano Duarte, Georgios Zervakis +2 more

Co-Optimization of Analog Kolmogorov-Arnold Networks for Low-Power Function Approximation in Flexible Electronics

Error-aware training and multi-level pruning enable efficient on-sensor function approximation in flexible electronics for biosignals and ca

Wearable devices and Internet of Things (IoT) sensors require on-sensor processing of biosignals and environmental data, including computationally demanding operations such as nonlinear activation functions for neural network inference, sensor calibration curves to map raw readings to physical units, and signal preprocessing functions like logarithmic compression and power operations for feature extraction. These functions exhibit significant complexity, often involving transcendental operations and multivariate dependencies that are costly to implement digitally. Analog function approximation provides a power-efficient alternative by performing these computations in the analog domain, thereby reducing the energy overhead associated with analog-to-digital conversion and subsequent digital processing. Flexible Electronics (FE) present a particularly attractive platform for wearable applications due to mechanical flexibility and low-cost fabrication, but impose strict constraints on circuit density and power consumption, making efficient analog implementations critical but challenging. This work introduces Analog Kolmogorov-Arnold Networks (AKANs), developed via hardware-software co-optimization, to approximate these complex multivariate functions accurately under hardware imperfections. Our method incorporates circuit-level error modeling during training and applies pruning at both software and hardware levels to reduce area and power. Validation across multiple benchmarks demonstrates that our proposed pruning methodology not only reduces hardware cost but can also improve approximation accuracy by regularizing spline parameters. Results show up to 55% area and 50% power savings, with average reductions of nearly 30% across datasets, highlighting AKANs as a robust and generalizable framework for low-power analog function approximation in FE.

cs.NE 2026-06-29

SNN pruning reaches 95.6% accuracy at 90% sparsity

by Muhammad Hamza

Criticality-Constrained Iterative Pruning for Energy-Efficient Spiking Neural Networks via Combined Importance Scoring

Criticality-constrained quadratic pruning exceeds magnitude pruning by 2.2 points and enables 73% energy reduction at 70% sparsity via combi

Deploying spiking neural networks (SNNs) on neuromorphic hardware demands aggressive synaptic pruning while preserving temporal computation integrity. Existing strategies either neglect neuronal criticality or rely on convex relaxations of the inherently combinatorial pruning problem whose fractional masks, upon binarisation, destroy accuracy at moderate-to-high sparsity. We present Criticality-Constrained Quadratic Pruning (CQP), a native PyTorch pipeline that fuses weight magnitude with surrogate-gradient criticality into an analytically exact importance metric, eliminating the rounding artefacts endemic to solver-based approaches. We formally characterise a continuous-relaxation trap wherein OSQP-solver fractional masks overshoot the intended sparsity by up to 12 percentage points (pp), precipitating a 44 pp accuracy collapse. We identify and remediate a zombie-weight failure mode in which Adam's first-moment tensors resurrect pruned synapses, violating the binary sparsity guarantee. An iterative schedule - prune, fine-tune with gradient masking, recompute criticality, and repeat - eliminates gradient staleness at high sparsity. A KL-divergence temporal analysis identifies a redundant simulation timestep, enabling a free 10% theoretical energy reduction without weight modification. On MNIST (60,000 training examples), CQP yields 95.6% accuracy at 90% sparsity versus 93.4% for magnitude pruning (+2.2 pp). A criticality-threshold sweep reveals an empirical criticality cliff: accuracy falls from 87.0% to 14.4% as the threshold reaches tau = 0.9, constituting a quantitative SNN-level analogue of the Critical Brain Hypothesis. Combined weight sparsification and temporal truncation yield a compound 73% reduction in per-inference energy at 70% sparsity, confirming the practical value of the proposed pipeline for neuromorphic deployment.
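
The zombie-weight failure mode can be reproduced with a hand-written Adam update. The sketch below is a toy NumPy illustration, not the CQP pipeline: the three-weight vector, the constant gradient, the step counts, and the function names are all assumptions chosen for the example.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One textbook Adam update; returns new weights and moment estimates."""
    m = b1 * m + (1.0 - b1) * g
    v = b2 * v + (1.0 - b2) * g * g
    m_hat = m / (1.0 - b1**t)
    v_hat = v / (1.0 - b2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def finetune_after_pruning(fix):
    w = np.array([0.5, -0.02, 0.3])
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    g = np.array([0.1, 0.1, 0.1])  # hypothetical constant gradient
    t = 0
    for _ in range(5):             # dense warm-up fills the optimizer state
        t += 1
        w, m, v = adam_step(w, g, m, v, t)
    mask = np.array([1.0, 0.0, 1.0])  # prune the small middle weight
    w = w * mask
    if fix:
        m, v = m * mask, v * mask     # clear stale moments of pruned weights
    for _ in range(3):                # fine-tune with masked gradients
        t += 1
        w, m, v = adam_step(w, g * mask, m, v, t)
        if fix:
            w = w * mask              # re-apply the mask after every step
    return w

naive = finetune_after_pruning(fix=False)
fixed = finetune_after_pruning(fix=True)
print("gradient masking only:", naive)
print("state reset + re-mask:", fixed)
```

With gradient masking alone, the stale first moment keeps moving the pruned weight away from zero; clearing the moments and re-applying the mask keeps it at exactly zero. Whether CQP remediates the problem in this particular way is not stated in the abstract.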

q-bio.NC 2026-06-29

Toolkit unifies CANN simulation and attractor analysis

by Sichao He, Aiersi Tuerhong +5 more

CANNs: A Toolkit for Research on Continuous Attractor Neural Networks

Python library, Rust backend and homology pipeline recover results on spatial and directional encoding.

Continuous attractor neural networks (CANNs) are the canonical computational framework for how the brain encodes continuous variables such as spatial position, head direction, and movement direction, and explain the activity of hippocampal place cells, entorhinal grid cells, and head-direction cells. CANN research, however, is fragmented: most results rest on lab-specific implementations, general-purpose simulators lack CANN-specific abstractions, and the path from spike trains to attractor geometry in real recordings lacks a standardized toolkit. Here, we present a comprehensive open-source toolkit that unifies the full CANN research workflow. It combines three tightly integrated components: 1) canns, a Python library on BrainPy/JAX that provides standardized 1D/2D CANNs, spike-frequency-adaptation variants, grid cell networks, hierarchical path-integration models, and brain-inspired attractor architectures, together with curated datasets, task generators, an analyzer module and trainer modules for biologically plausible plasticity; 2) canns-lib, a Rust acceleration backend delivering hundreds-of-times speedups for spatial-navigation workloads and modest gains for Ripser-based persistent homology; 3) ASA (Attractor Structure Analyzer), a PySide6 pipeline applying persistent homology and cohomology to experimental neural recordings to detect ring-like and toroidal attractor signatures in real data. The toolkit ships with full-detail reproducible pipelines that recover recent CANN results including SFA-driven anticipative tracking, theta sweeps in head-direction/place/grid systems, and hierarchical path integration.

cs.NE 2026-06-29

Late local search boosts constrained DE by 5.58 percent in U-score

by Dikshit Chauhan, Anupam Trivedi

DE-2LS: Differential Evolution with Lightweight Late Local Search for Constrained Numerical Optimization

DE-2LS adds small end-of-run polishing to improve accuracy while retaining the base method's speed.

Constrained single-objective numerical optimization requires a careful balance among feasibility, objective convergence, and computational efficiency under a fixed function-evaluation budget. This paper proposes DE-2LS, a late-stage, local-search-enhanced variant of differential evolution built on the RDEx framework. The proposed method preserves the original RDEx components, including mutation and crossover operators, success-history adaptation, archive mechanism, population-size reduction, and $\epsilon$-based constraint handling. A lightweight coordinate-pattern local search is added as a guarded polishing component around the current best solution. It is activated only in the late stage of the run, uses a small evaluation budget, and accepts candidates through a feasibility-aware comparison rule. Ablation results show that the finalized DE-2LS configuration achieves the best U-score among all tested variants, confirming that controlled late-stage refinement is more effective than aggressive or premature local search. In the direct comparison with RDEx, DE-2LS achieves a 5.58% gain in U-score. In the four-algorithm comparison, DE-2LS obtains the highest overall U-score of 80968 and the best total rank of 48 among RDEx, CL-SRDE, and UDE-III. These results indicate that DE-2LS improves the exploitation capability of the RDEx-based search framework while preserving its speed advantage under the combined speed-accuracy scoring criterion. The source code of DE-2LS is available at https://github.com/ChauhanDikshit?tab=repositories.

cs.NE 2026-06-29

Late local search raises DE optimization scores by 11%

by Dikshit Chauhan

DE-2LS: Differential Evolution with Late-Stage local-search for Unconstrained Single-Objective Numerical Optimization

DE-2LS refines RDEx with smoothed branch rates and guarded pattern search to improve both speed and final quality on single-objective tasks.

Unconstrained single-objective numerical optimization requires a careful balance among global exploration, late-stage exploitation, and function-evaluation efficiency. This paper presents DE-2LS, a late-stage, local-search-enhanced differential evolution framework built on RDEx for unconstrained single-objective optimization with variable bounds. The proposed method preserves the original RDEx evolutionary search engine and introduces two conservative refinements: a smoothed exploitation-biased branch-rate update in the late search stage and a guarded coordinate-pattern local-search that serves as a budget-aware refinement mechanism. Since the considered setting is unconstrained apart from variable bounds, all selection and local-search acceptance decisions are based solely on objective values. To determine the final algorithm configuration, we conduct a staged ablation study by testing multiple settings of the EB-rate smoothing mechanism, the initial EB-rate, the standard-branch Gaussian sampling scale, the selection-pressure parameters, and the local-search coefficients. The final configuration is selected using a U-score-based evaluation that jointly reflects solution quality and convergence speed. Experimental results show that DE-2LS consistently improves the original RDEx in direct head-to-head comparison. In particular, DE-2LS increases the U-score from $33602.0$ to $37448.0$, corresponding to an improvement of $11.45\%$. Moreover, compared with several competitive and IEEE CEC-winning algorithms, DE-2LS achieves the best overall U-score of $178966.5$, outperforming the others by $34.43\%$. These results show that a carefully designed late-stage local-search strategy can improve both convergence speed and the final objective quality of the algorithm. The source code of DE-2LS is available at https://github.com/ChauhanDikshit?tab=repositories.
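
A guarded coordinate-pattern local search of the kind described can be sketched as follows. This is a generic pattern search under stated assumptions, not the DE-2LS implementation: the function name, the relative step size, the shrink factor, and the first-improvement acceptance rule are hypothetical choices, and acceptance uses objective values only, as in the bound-constrained setting above.

```python
import numpy as np

def coordinate_pattern_search(f, x_best, f_best, lower, upper,
                              step=0.1, budget=50, shrink=0.5, min_step=1e-8):
    """Polish x_best by probing +/- steps along each coordinate.

    Stops when the evaluation budget is spent or the step becomes tiny.
    Returns the refined point, its objective value, and evaluations used.
    """
    x = np.array(x_best, dtype=float)
    fx = float(f_best)
    evals = 0
    while evals < budget and step > min_step:
        improved = False
        for i in range(x.size):
            for direction in (1.0, -1.0):
                if evals >= budget:
                    return x, fx, evals
                cand = x.copy()
                delta = direction * step * (upper[i] - lower[i])
                cand[i] = np.clip(cand[i] + delta, lower[i], upper[i])
                fc = f(cand)
                evals += 1
                if fc < fx:            # accept only strict improvements
                    x, fx, improved = cand, fc, True
                    break
        if not improved:
            step *= shrink             # no coordinate helped: refine the pattern
    return x, fx, evals

def sphere(z):
    return float(np.sum(z**2))

lower = np.array([-1.0, -1.0])
upper = np.array([1.0, 1.0])
x0 = np.array([0.3, -0.2])
x_new, f_new, used = coordinate_pattern_search(sphere, x0, sphere(x0), lower, upper)
print(x_new, f_new, used)
```

Because candidates are accepted only when they improve the incumbent, the polished solution can never be worse than the one passed in, which is one way to read "guarded" in this context.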

cs.NE 2026-06-26

Fourier evolution separates scaffold and substructure edits in molecules

by Elia Colleoni, Paolo Guida +2 more

Multi-Objective Molecular Generation with Frequency-Controlled Evolutionary Dynamics

SpectralMol projects structures to a fixed frequency basis so NSGA-II optimizes multiple goals without pre-training or scalar rewards.

Molecule generation methods that leverage generative models have been successfully applied to drug discovery. However, they often require extensive pre-training, suffer from statistical biases in the training data, and can offer limited interpretability of the generated chemical structures. In this work, we introduce SpectralMol, an algorithm based on evolutionary computation that processes chemical structures as a compact matrix of Fourier coefficients, projected onto a fixed basis to generate position-wise latent vectors for SELFIES decoding. The NSGA-II algorithm enforces diversity and enables separate objective functions rather than collapsing the objectives into a scalar reward. The quality of the algorithm was tested against standardized benchmarks. The results show comparable aggregate benchmark performance with a task-dependent profile: SpectralMol is strongest on several multi-parameter optimization tasks. The same benchmark was used to perform an ablation study to demonstrate the advantages of a structured latent matrix. Finally, the method was tested on a realistic ClpP-targeted drug-discovery benchmark and compared with a reinforcement-learning-based model under a fixed oracle-call budget. SpectralMol generates more docking hits and more diverse scaffolds while maintaining competitive physicochemical properties. The representation adopted in this work cleanly separates scaffold-level modifications from localized substructure variations, as the former correspond to perturbations of low-frequency Fourier modes and the latter to perturbations of high-frequency Fourier modes. The results support the view that frequency-controlled evolutionary dynamics provide an interpretable, efficient, and training-free route to multi-objective molecular design.

cs.CL 2026-06-26

Key-axis erase supplies content signals to recurrent gates

by Sayak Dutta

CARVE: Content-Aware Recurrent with Value Efficiency for Chunk-Parallel Linear Attention

CARVE reuses output tensors for decisions, cuts parameters 19 percent, and tops retrieval benchmarks at 0.4 percent overhead.

Recurrent models must forget in order to remember, yet the state of the art decides what to erase without consulting what is stored -- the gate sees only the arriving token, not the memory it is about to modify. This memory-blind gating is one of three coupled defects in the leading delta-rule architecture (GDN-2): the value-axis erase mask wastes parameters at the scale of the value projection, and -- as we prove -- mathematically prevents the WY-form triangular chunk solver that makes recurrent training competitive with Transformers. We introduce CARVE (Content-Aware Recurrent with Value Efficiency), which resolves all three problems through one principle: erase only on the key axis. This is provably necessary and sufficient for the WY-form solver to remain valid. Within it, CARVE reuses the recurrent output tensor -- already written to GPU memory -- as a free content signal for the erase gate, and replaces the per-value write-gate projection with a single scalar per head. At initialisation CARVE is bit-identical to GDN-2; any quality difference emerges from what the content gate learns. At 1.3B parameters trained on 100B tokens, CARVE achieves WikiText perplexity 15.72 (minus 0.18 vs. GDN-2, a 4.5-sigma effect), leads every recurrent baseline on nine common-sense reasoning benchmarks, and sets state of the art on every RULER retrieval probe -- at 0.4% throughput overhead, 13% lower peak memory, and 19% fewer parameters. Six formal theorems cover memory capacity, Lyapunov stability, gradient flow, expressivity separation, Pareto-optimal chunk size, and hybrid optimality.

q-bio.NC 2026-06-26

Output-use feedback builds organized networks in agent systems

by Claus Metzner, Ali Ghebleh +4 more

Surviving by Serving: Functional Relevance Drives Self-Organization in Complex Adaptive Systems

A minimal model shows agents persist when their outputs are used and adapt when ignored, forming chains and core-periphery structures withou

Complex adaptive systems often develop organized structures without centralized control. Yet the local mechanisms by which functional organization emerges and persists remain incompletely understood. Here we propose Surviving by Serving (SBS) as a general principle of self-organization: components persist as long as their outputs are utilized by other components, whereas prolonged non-utilization promotes adaptation and exploration. To investigate this idea, we introduce a minimal multi-agent model in which agents transform shared resources and receive only local feedback when their outputs are subsequently utilized elsewhere in the system. Despite the absence of global objectives, the system spontaneously self-organizes into functional interaction networks. We observe the emergence of stable transformation chains, core-periphery organization, and the generation of novel states that enable previously inaccessible target conditions to be reached. Remarkably, self-sustaining interaction networks can arise even without external selection pressures, creating a pre-adaptive search phase from which later functional solutions emerge. These findings suggest that functional utilization may provide a simple, substrate-independent mechanism for the emergence and stabilization of organized structure in complex adaptive systems.

cs.NE 2026-06-26

Bézier curves steer random walks for competitive optimization

by Jinpeng Wang, Xingguo Xu +4 more

Random Walk on Bézier Curves for Global Optimization

Adaptive curve orders shift from wide exploration to local refinement on 41 CEC functions up to 100 dimensions.

Balancing exploration and exploitation remains a central challenge in metaheuristic optimization. To address this issue, this paper proposes Bézier Walk Evolution (BWE), a geometry-driven optimization framework that reformulates evolutionary search as adaptive trajectory construction in the decision space. BWE integrates Bézier curve modeling with a distance-aware random walk mechanism to generate topology-guided search trajectories. By adaptively varying the curve order during evolution, the proposed method enables a smooth transition from diversified global exploration to refined local exploitation. Higher-order Bézier curves leverage multiple population-derived control points to enhance search diversity, while lower-order curves generate near-linear trajectories to improve convergence efficiency. This adaptive geometric search mechanism provides an interpretable alternative to conventional nature-inspired designs. Extensive experiments on 41 benchmark functions from the CEC2017 and CEC2022 suites, spanning dimensions from 10 to 100, show that BWE achieves strong overall performance and favorable scalability compared with 7 classical and 6 state-of-the-art optimizers, including L-SHADE and CMA-ES. Additional evaluations on five constrained engineering design problems further demonstrate the practical applicability and robustness of BWE.
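
The role of the curve order can be seen from how a Bézier point is evaluated. The sketch below uses the standard De Casteljau recursion in NumPy; how BWE chooses control points from the population and schedules the order is not given in the abstract, so the control points here are hypothetical.

```python
import numpy as np

def bezier_point(control_points, t):
    """Evaluate a Bezier curve of order len(control_points) - 1 at t in [0, 1].

    De Casteljau: repeatedly interpolate between neighbouring control points
    until a single point remains.
    """
    pts = np.asarray(control_points, dtype=float)
    while len(pts) > 1:
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# Order 1 (two control points): a straight segment between two solutions.
line_mid = bezier_point([[0.0, 0.0], [2.0, 0.0]], 0.5)

# Order 2 (three control points): the middle point bends the trajectory.
quad_mid = bezier_point([[0.0, 0.0], [1.0, 2.0], [2.0, 0.0]], 0.5)

print(line_mid, quad_mid)
```

With two control points the trajectory is the straight line between them, which corresponds to the near-linear, exploitation-oriented case; each additional control point pulls the curve toward another region of the decision space.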

math.OC 2026-06-26

R2 subset selection is NP-hard in 3 objectives but greedy achieves 1-1/e approx

by Michael T. M. Emmerich

Three-Objective Integral R2 Subset Selection: NP-Hardness and Submodular Approximation

The integral R2 gain forms a monotone submodular set function, enabling reliable approximation despite exact hardness.

Selecting a fixed number of representative points from a finite Pareto-front approximation is a fundamental post-processing task in multiobjective optimization. This paper studies this problem for the integral R2 indicator in three objectives, where the indicator is defined as the integral of the lower envelope of weighted Tchebycheff scalarizations over the two-dimensional weight simplex. We provide two complementary algorithmic results. On the positive side, we show that the integral R2 improvement with respect to any fixed baseline is a monotone submodular set function. For the usual ideal-point based R2 indicator, with the ideal point fixed, this yields a direct gap-reduction guarantee: greedy selection closes at least a $(1-1/e)$-fraction of the maximum possible R2 gap between a fixed dominated anchor value and the best cardinality-$k$ value. We also give a tested greedy implementation that evaluates exact integral R2 values by subdivision, with worst-case running time $O(n^6)$. On the negative side, we prove that exact fixed-cardinality subset selection is NP-hard already in three objectives. The hardness proof uses a perspective transformation that maps Tchebycheff-shadow improvements to a weighted anchored-box union problem with density $(x_1+x_2+x_3)^{-4}$, and then adapts the three-dimensional anchored-box construction of Bringmann, Cabello, and Emmerich. Together, these results separate the tractable two-objective case from the three-objective case while identifying a principled approximation route based on submodular optimization.
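
The greedy scheme can be sketched on a discretized version of the indicator. Note the key assumption: the paper evaluates the exact integral over the weight simplex by subdivision, whereas this sketch replaces the integral with an average over a finite weight grid. The function names, the grid resolution, the random test front, and the anchor point are all hypothetical.

```python
import numpy as np

def simplex_grid(h):
    """Weights (i, j, h-i-j)/h covering the two-dimensional weight simplex."""
    return np.array([(i, j, h - i - j)
                     for i in range(h + 1)
                     for j in range(h + 1 - i)], dtype=float) / h

def tchebycheff(points, weights, ideal):
    """Weighted Tchebycheff values, shape (n_weights, n_points)."""
    diff = np.abs(points[None, :, :] - ideal)
    return np.max(weights[:, None, :] * diff, axis=2)

def greedy_r2_subset(points, k, weights, ideal, anchor):
    """Pick k points greedily by marginal reduction of the averaged R2 value."""
    values = tchebycheff(points, weights, ideal)
    envelope = tchebycheff(np.asarray(anchor, float)[None, :], weights, ideal)[:, 0]
    taken = np.zeros(len(points), dtype=bool)
    chosen, gains = [], []
    for _ in range(k):
        lowered = np.minimum(envelope[:, None], values)
        marginal = np.mean(envelope[:, None] - lowered, axis=0)
        marginal[taken] = -np.inf          # never pick the same point twice
        best = int(np.argmax(marginal))
        chosen.append(best)
        gains.append(float(marginal[best]))
        taken[best] = True
        envelope = lowered[:, best]        # new lower envelope
    return chosen, float(envelope.mean()), gains

rng = np.random.default_rng(0)
pts = np.abs(rng.normal(size=(30, 3)))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)   # hypothetical 3-objective front
W = simplex_grid(10)
ideal, anchor = np.zeros(3), np.ones(3)
chosen, r2_value, gains = greedy_r2_subset(pts, 5, W, ideal, anchor)
anchor_r2 = float(tchebycheff(anchor[None, :], W, ideal).mean())
print(chosen, r2_value, gains)
```

On this discretized objective the successive greedy gains are non-increasing, which is the diminishing-returns behaviour that the submodularity result formalizes for the integral indicator.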

cs.LG 2026-06-25

Epoch structure lets agents and evaluators co-evolve while keeping improvement guarantees

by Alex Iacob, Andrej Jovanović +11 more

The Red Queen Gödel Machine: Co-Evolving Agents and Their Evaluators

Fixed intra-epoch criteria allow utility updates at boundaries, raising coding pass rates and acceptance rates with 1.35x-1.72x fewer tokens

Self-improving agents are state-of-the-art (SOTA) on agentic coding benchmarks and have recently been extended to general domains. However, their search methods generally assume a stationary evaluation criterion: a fixed verifier, benchmark, or labeled dataset that remains valid as the agent improves. This ignores a central feature of evolution: species adapt as their environments change with them. We aim to bring the same principle to recursive self-improvement, making evaluation part of the improvement loop and opening search to evolving evaluators, adversarial objectives, and dynamic utilities that may surpass static benchmarks. We introduce the Red Queen Gödel Machine (RQGM), an evolutionary framework for recursive self-improvement under non-stationary utilities. The RQGM makes this possible through controlled utility evolution: search is organized into epochs with a fixed within-epoch evaluation criterion, while the utility can be updated at epoch boundaries, so self-improvement guarantees hold per epoch as the objective evolves across them. We begin by showing that even on verifiable coding tasks, the RQGM improves test pass rate over the prior SOTA by adding a complementary agent-as-a-judge code-review signal. This signal is cheaper and the RQGM uses 1.35x-1.72x fewer tokens. We then turn to scientific paper writing and reviewing, and Olympiad-level proof writing and grading, where the RQGM improves performance over prior self-improving agents: co-evolved writers reach 1.78x-1.86x higher acceptance rates under a diverse agent-as-a-judge panel, while co-evolved graders reach 9% higher ground-truth accuracy. In paper reviewing, the strongest baseline reviewer over-accepts AI-generated papers at up to 1.91x the human rate. The RQGM corrects this by introducing an adversarial objective that discovers reviewers equally stringent on AI and human work.

cs.NE 2026-06-25

Spacing objective alone evolves flock alignment

by Craig Reynolds

EvoFlock: evolved inverse design of multi-agent motion

Genetic search finds parameters where neighbor distance maintenance produces coordinated group motion without dedicated alignment rules.

This paper describes an automatic method for adjusting or tuning models of multi-agent motion. Simulating the motion of bird flocks, human crowds, vehicle traffic, and other multi-agent systems is a widely used technique. These simulations model the behavior of a single group member (bird, human, or vehicle). The group behaviors (flock, crowd, traffic) emerge from interactions between group members. These models typically have many numerical control parameters. Even if each parameter is intuitive in isolation, their interaction can be complex and nonlinear. It is challenging to determine which parameters to adjust for the desired change in group behavior. Changing one aspect of group behavior often causes other aspects to change, leading to a tedious process of incremental changes. This work takes an inverse design approach. The desired group behavior is measured with a user-defined objective (fitness or loss) function and optimized with a genetic algorithm. The objective function used here for basic flocking rewards proper spacing with neighbors, flying near a desired speed, and avoiding obstacles. Interestingly, the vivid alignment seen in bird flocks appears to emerge from maintaining proper spacing between flockmates.

cs.NE 2026-06-24

Noise fields pick subnetworks to store multiple functions in one net

by Shuhei Ikemoto, Fabio DallaLibera

Spatial Partial Functionalization of Neural Networks based on Noise Fields

Capacity rises when noise locations match functional proximity in 1D tasks; mismatches lower it.

Noise in neural computation is typically regarded as a disturbance, but its spatial distribution may also actively regulate which parts of a network participate in computation. This paper investigates the spatial partial functionalization of Noise-modulated Neural Networks using noise fields. We first present an activation function suitable for this goal, the crossing activation function, using the sample-level, statistical-level, and analytical-level implementations, and examine parameter reuse across these implementations. We then introduce a virtual noise field, an auxiliary continuous space for generating spatially structured network noise fields that activate partially overlapping subnetworks. Using one-dimensional function approximation tasks, we evaluate how multiple functions can be stored in a single network when each function is assigned to a different noise-field location. The results show that memory capacity improves when the spatial arrangement of noise fields reflects the proximity relationships among the functions to be learned, whereas mismatches in noise field structure can reduce effective capacity. These findings suggest that structured noise can serve not only as a perturbation but also as a topology-defining factor for functional subnetwork selection.

cs.SD 2026-06-24

Cancer speech model correlates most with spectral features

by Tuan Nguyen (LIA, AU) +7 more

What Does a Pathological Speech Assessment Model Know about Acoustic Features? A Case Study on Oral and Oropharyngeal Cancer Patients

Spectral group reaches 0.77 correlation and prosodic group reaches 0.71 in Wav2Vec embeddings.

This work investigates the interpretability of a Wav2Vec 2.0-based speech intelligibility assessment model for oral and oropharyngeal cancer patients through canonical correlation analysis. By measuring the correlation between the model embeddings and eGeMAPS low-level descriptors (LLDs) as an interpretable reference, we analyze how acoustic information is encoded across the model layers. The analysis is conducted at two levels: layer-wise for individual LLDs, and at the group level for the prosodic, spectral, and voice-quality groups. Results show that the learned representations are most strongly correlated with spectral and prosodic features, with the first MFCC coefficient yielding the highest correlations across all layers. At the group level, the spectral and prosodic groups achieve correlations of 0.77 and 0.71 respectively, while voice quality reaches 0.65. Beyond model interpretability, this work also offers practical guidance on acoustic feature selection for pathological speech assessment.

cs.NE 2026-06-24

Speciated search matches peak toxicity with lower cumulative pressure

by Onkar Shelar, Travis Desell

Distributed Quality-Diversity Search for Toxicity in Large Language Models

ToxSearch-S equals prior methods on worst-case toxicity while keeping the search path less toxic and running faster on multiple workers.

Large Language Models remain vulnerable to adversarial prompts that elicit harmful responses, and scaling red-teaming to cover a broad range of failure modes is constrained by the cost of text generation and evaluation. We present ToxSearch-S, a speciated extension of toxicity-focused evolutionary prompt search with incremental, embedding-driven niche maintenance, together with an MPI master-worker realization that centralizes population and species bookkeeping on rank 0 while offloading prompt evolution and evaluation to $n_w$ parallel workers. Under a common budget, ToxSearch-S attains peak toxicity competitive with both ToxSearch and RainbowPlus while following a measurably less toxic best-so-far trajectory, indicating lower cumulative search pressure. Diversity is non-uni-dimensional: RainbowPlus yields greater embedding-level spread, whereas ToxSearch-S partitions high-toxicity prompts into more localized behavioral pockets, reflected by a higher DBSCAN cluster count. MPI distribution delivers substantial wall-clock gains, approximately $1.8\times$ with two workers and $3.2\times$ with four, while leaving Best@B statistically indistinguishable from sequential execution. Four-worker runs also produce significantly larger final species cardinality and more toxicity-bearing species, without a reliable gain in global peak toxicity. These results position incremental speciation as a practical quality-diversity mechanism for AI Safety and MPI as an effective means of compressing time-to-result while preserving measured search outcomes.

q-bio.NC 2026-06-23

Local 2- and 3-cycles boost RNN computation on Boolean tasks

by Tom Talpir, Elad Schneidman

Identifying structural design principles shaping the computational abilities of recurrent neural networks

Networks with these short cycles solve more functions, are often minimal, and are predicted by a few structural measures.

Understanding how the architecture of neural networks shapes the computations they carry out is a central challenge in neuroscience and machine learning. While specific circuit architectures have been linked to particular network computations and theoretical bounds on expressivity of broad classes of networks have been found, we are still missing general principles connecting the structure of finite networks to their computational capabilities. Here, we characterize the computational abilities of recurrent neural networks as a function of their connectivity by training a large collection of different networks to compute a large set of Boolean functions. For small networks, we constructed the complete "catalogs" of network-function performance, which revealed that computational capacity varies widely across architectures and that most networks show poor performance, and most functions are hard to compute. However, we show that having local 2- and 3-cycles in a network strongly enhances its computational ability, and networks with such cycles are often the minimal architectures that can solve particular functions. We further show that a small set of structural statistics accurately predicts networks' performance. Extending our analysis to large networks showed that typical networks fail even to approximate a randomly selected function. Surprisingly, adding a small number of sparsely connected biologically-inspired interneurons to the network dramatically increases computational capacity. As in small networks, adding short cycles improved networks' capacity, outperforming acyclic or reachability-matched controls. Thus, our results identify local cycles as design principles linking neural connectivity to computational power, and offer a general framework to explore structure-function relations in computing networks.
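
Short cycles of this kind are easy to count from a connectivity matrix. The sketch below is a generic graph computation, not the paper's structural statistics: it assumes a directed, unweighted graph, ignores self-connections, and treats a 2-cycle as a reciprocal pair and a 3-cycle as a directed triangle. The function name and the example graph are hypothetical.

```python
import numpy as np

def count_short_cycles(adjacency):
    """Return (number of directed 2-cycles, number of directed 3-cycles).

    adjacency[i, j] != 0 means there is a connection from unit i to unit j.
    Closed walks of length L are counted by trace(A**L); with self-loops
    removed, each 2-cycle is counted twice and each 3-cycle three times.
    """
    a = (np.asarray(adjacency) != 0).astype(np.int64)
    np.fill_diagonal(a, 0)
    a2 = a @ a
    two_cycles = int(np.trace(a2)) // 2
    three_cycles = int(np.trace(a2 @ a)) // 3
    return two_cycles, three_cycles

# Example: units 0 and 1 are reciprocally connected; 1 -> 2 -> 3 -> 1 is a loop.
A = np.zeros((4, 4), dtype=int)
for src, dst in [(0, 1), (1, 0), (1, 2), (2, 3), (3, 1)]:
    A[src, dst] = 1

cycles = count_short_cycles(A)
print(cycles)
```

Counts like these could serve as inputs to a performance predictor, although which structural statistics the authors actually use is not specified in the abstract.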

cs.LG 2026-06-23

Quadratic activation lets small nets master Game of Life

by Tashin Ahmed, Q. Tyrell Davis

It's Much Easier for Neural Networks to learn Game of Life Dynamics with the Right Activation Function: Polynomial Kolmogorov-Arnold Networks

A second-degree polynomial matches the update rules, enabling reliable learning even when weights stay frozen.

Previous work has found a gap between the scale of neural networks that reliably learn Conway's Game of Life, and minimal networks capable of representing the classic cellular automaton with hard-coded parameter values. Viewing neural network learning as a search process suggests a dependence on networks large enough to contain sub-networks with lucky initializations (sometimes known as 'winning tickets') that actually learn the task. In this work, we reorient our perspective from discovering Life rules as a search problem back to a learning problem, and reason that with fitting inductive biases, the problem should be much more amenable to minimal networks. We find that network variants with several alternative activation functions meaningfully outperform the default choice of Rectified Linear Units, and in particular, that a 2nd degree polynomial activation function consistently learns Life dynamics with or without the benefit of learning neural weights. Our results provide an informative demonstration of the benefits of matching learning to the task at hand and challenge the easy default choice of scale for all problems. In particular, we advocate for the use of cellular automata as simple test domains for developing strategies that can benefit machine learning for science, physics-based deep learning, and interpretable machine learning.


cs.ET 2026-06-23

16-bit LFSR sets firing probability in open stochastic LIF neuron

by Poornima Kumaresan, Santhosh Sivasubramani

An Open-Source LFSR-Based Stochastic Leaky Integrate-and-Fire Neuron in SkyWater 130 nm: Design, Stochastic Characterisation, and Rate Coding

Eight-entry table and leaky integrator deliver monotonic rate coding and controlled randomness in 130 nm standard-cell CMOS.


Stochastic spiking neurons trade exact arithmetic for controlled randomness, lowering area and tolerating input noise, which suits event-driven edge hardware. We present a compact, configurable stochastic leaky integrate-and-fire neuron in standard-cell CMOS on the SkyWater 130 nm process, released openly. A 16-bit configurable-polynomial linear-feedback shift register drives an eight-entry programmable activation table that sets a Bernoulli firing probability, and a saturating 16-bit leaky integrator with a programmable threshold and a refractory period of zero to seven cycles produces the spike train. All parameters are set through a sixteen-register serial interface, and the neuron runs from parallel inputs or entirely from the register file. From a model checked bit-exact against the register-transfer code, the period is 65535 states for a maximal-length polynomial and 63 for the shipped default, the eight-bit comparison value is uniform over the full period, and the per-entry firing probability equals the table value divided by 256. We also characterise a property a system-level model would not expose: the comparator output is serially correlated at short lags, with a negative lobe near lag eight, because the compared byte shifts by one bit each cycle; subsampling every sixteen cycles restores whiteness. Rate-coding sweeps show monotonic control of the output rate by the input weight and the threshold, and the refractory period caps the rate at one spike per refractory-plus-one cycles. The neuron occupies about 10,600 square micrometres at 70 per cent utilisation on a single Tiny Tapeout tile, meets 50 MHz timing with positive margin, and passes eighteen directed cocotb tests at register-transfer and gate level. All results are pre-silicon, from simulation and the open flow. The neuron is an openly released companion to a four-block neuromorphic suite reported separately.
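The LFSR-plus-comparator mechanism can be sketched in a few lines. This is a software illustration under stated assumptions, not the released register-transfer code: the tap set (16, 14, 13, 11) is one well-known maximal-length choice, the seed `0xACE1` is arbitrary, and the firing rule "low byte below the table value" is a guess at how an eight-bit comparison could be wired.

```python
def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR with taps 16, 14, 13, 11 (assumed polynomial)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def lfsr_period(seed=0xACE1):
    """Number of steps until the LFSR returns to its seed."""
    state, steps = lfsr16_step(seed), 1
    while state != seed:
        state = lfsr16_step(state)
        steps += 1
    return steps


def firing_count(table_value, seed=0xACE1, cycles=65535):
    """Count cycles on which the low byte of the state is below table_value."""
    state, fires = seed, 0
    for _ in range(cycles):
        # Assumed Bernoulli comparator: fire when the 8-bit sample < table entry.
        if (state & 0xFF) < table_value:
            fires += 1
        state = lfsr16_step(state)
    return fires
```

With a maximal-length polynomial the register visits every nonzero 16-bit state once per period, so over a full period the firing rate for a table value `v` lands very close to `v / 256`; the small deviation in this sketch comes from the excluded all-zero state.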


cs.NE 2026-06-23

Pheromone traces let local synapses learn without global gradients

by Xingcheng Fu, Xianjun Chen +1 more

Local Pheromone Network: Sparse Local Learning with Multi-Scale Synaptic Trails, Consolidation, and Replay

Adaptive Hebbian updates on budgeted local connections handle partitioned and conflicting memories in a prototype network.


Backpropagation-trained dense neural networks are powerful function approximators, but they couple learning across many parameters and can overwrite previous associations when tasks conflict. This paper describes Local Pheromone Network, a small research prototype for sparse, local, manually updated neural networks. In Local Pheromone Network, each output unit reads only a fixed local neighborhood of input units subject to geometric distance and molecular-tag compatibility. Each synapse stores a weight, a short-term pheromone trace, a long-term pheromone trace, and an optional consolidation state. Training does not call automatic differentiation. Instead, every layer performs a pheromone-weighted Hebbian-style update on a budgeted subset of local synapses selected from local error and co-activity. The update budget adapts online: it shrinks when loss improves and expands toward recently active neighborhoods when loss worsens. Optional mechanisms add structural plasticity, local replay, output masks for partitioned learning, and a target-free local contrastive step. We present the implementation, learning rule, and preliminary experiments on synthetic regression, partitioned memory, conflicting memory, consolidated conflict, structural plasticity, replay, and a synthetic long-context hybrid memory task. The prototype learns local linear rules, preserves partitioned memories through tags and masks, reduces forgetting under consolidation, and uses replay under conflict.


cs.LG 2026-06-23

EML trees approximate any W^{k,∞} function

by Joe Germany, Elie Abdo +1 more

EML Trees Are Universal Approximators

Tree compositions of the exp-minus-log function match functions with bounded weak derivatives up to order k to any accuracy.


The recently introduced EML (Exp-Minus-Log) function acts as a continuous analogue of NAND gates, providing a compositional building block capable of representing elementary functions. In this work, we study the expressive power of tree-structured compositions of EML functions. We show that such trees enjoy a universal approximation property for functions in $W^{k, \infty}$ for $k \in \mathbb N$, drawing on classical neural network approximation arguments while exploiting the ability to explicitly construct EML trees that mimic polynomial representations. We further propose a learning algorithm for EML-type trees equipped with fitting parameters, and demonstrate its feasibility in practical optimization problems. Our results establish EML trees as a theoretically grounded framework for function approximation.
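A small sketch of how tree compositions of a single operator can reproduce elementary functions. The abstract does not spell out the operator, so the definition `eml(x, y) = exp(x) - ln(y)` is an assumption read off the name "Exp-Minus-Log", and the two trees below are illustrative constructions rather than ones taken from the paper.

```python
import math


def eml(x, y):
    """Assumed EML operator: exp(x) minus the natural log of y (y > 0)."""
    return math.exp(x) - math.log(y)


def exp_tree(x):
    """Depth-1 tree: exp(x) - ln(1) = exp(x)."""
    return eml(x, 1.0)


def ln_tree(x):
    """Depth-3 tree that recovers ln(x) for x > 0.

    eml(1, x)           = e - ln(x)
    eml(e - ln(x), 1)   = exp(e - ln(x)) = e**e / x
    eml(1, e**e / x)    = e - (e - ln(x)) = ln(x)
    """
    return eml(1.0, eml(eml(1.0, x), 1.0))
```

Both trees use only the operator and the constant 1, which is the sense in which EML plays a NAND-like role for elementary functions.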


cs.AI 2026-06-23

Market simulation mechanisms act as separate control knobs

by Zhibao Chen

Decomposing Financial Market Dynamics via Mechanism Analysis in an Evolutionary Multi-Agent Simulation

Selection tunes diversity, price feedback tunes realism, and bias tunes fragility while consensus shows no effect in matched interventions.


Evolutionary agent-based markets (ABMs) couple several mechanisms -- who reproduces, how price forms, how biased the agents are, how consensus propagates -- yet these are usually fixed by convention, so it is unclear which mechanism controls which emergent property. In a coevolving, endogenous-price simulator with 120 heterogeneous behavioral agents, we make four mechanisms pluggable and run matched 3x20-seed interventions. We find the levers are largely separable. (1) Selection -> diversity: a Quality-Diversity (QD/MAP-Elites) operator robustly raises strategy-mix entropy over truncation top-k (paired Delta entropy +0.27 to +1.12 bits; sign-test p<0.001; CIs exclude 0) and sustains more strategy cycling (strongest in crisis: Delta=+0.070, p=0.0004). (2) Selection does not improve realism: even a per-agent realism reward that provably steers selection does not raise 5-fact realism (Delta_5=-0.11,-0.08,+0.03; not significant). (3) Microstructure -> realism: enabling reflexive price feedback does raise realism (Delta_5=+0.13,+0.20,+0.20; crisis/bull p<0.05, all CIs positive). (4) Behavior -> fragility: amplifying behavioral bias raises a genomic fragility proxy (Delta=+10.5,+11.1,+14.4; bull p<0.001, all CIs positive) while leaving realism flat. The remaining mechanism -- consensus network topology -- shows no robust effect (honest null). The contribution is a decomposition: in these single-mechanism sweeps the mechanisms behave as approximately distinct control knobs over diversity, realism, and fragility.
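The two statistics quoted for the selection result, strategy-mix entropy in bits and a paired sign test, can be computed as sketched here. This is a generic illustration: the function names are hypothetical, and the simulator's actual strategy labels and test implementation are not given in the abstract.

```python
import math
from collections import Counter


def strategy_entropy_bits(strategies):
    """Shannon entropy (base 2) of the empirical strategy distribution."""
    counts = Counter(strategies)
    total = len(strategies)
    return -sum((k / total) * math.log2(k / total) for k in counts.values())


def sign_test_p(deltas):
    """Two-sided exact sign test on paired differences, ignoring zeros."""
    pos = sum(d > 0 for d in deltas)
    neg = sum(d < 0 for d in deltas)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A population split evenly over four strategies has 2 bits of entropy, and 20 paired differences that all share one sign give a two-sided p-value of 2 / 2^20, which is how a consistent per-seed gain can reach p < 0.001 with only 20 seeds.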


quant-ph 2026-06-23

Self-modulation stabilizes quantum fast-weight sequential learning

by Samuel Yen-Chi Chen, Yifeng Peng +9 more

Self-Modulating Quantum Fast-Weight Programmers for Efficient Adaptive Sequential Learning

Adaptive scaling of new updates and stored memory yields steadier convergence and higher accuracy across qubit counts and sequence lengths.


Recent advances in quantum machine learning have motivated efficient models for sequential data processing. In this paper, we propose Self-Modulating Quantum Fast Weight Programmers, or Self-Modulating QFWP, which extends Quantum Fast Weight Programmers by introducing adaptive modulation over both newly generated fast-weight updates and historical fast-weight memory. Numerical results show that the proposed mechanism improves convergence stability and prediction performance across varying model settings, including different numbers of qubits and input sequence lengths. We further provide theoretical arguments explaining how self-modulation balances new information injection with memory retention, thereby enhancing temporal information propagation. These results suggest that Self-Modulating QFWP is a compact and effective framework for quantum machine learning on time-series data.
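A classical sketch of the modulation idea, assuming the fast weights are circuit parameters updated additively and that each gate is an elementwise multiplier. The abstract does not give the update equation, so the form `theta_new = gate_old * theta + gate_new * delta` and all names below are assumptions made for illustration.

```python
def self_modulated_update(theta, delta, gate_old, gate_new):
    """Elementwise fast-weight update with separate gates (assumed form).

    theta    : current fast weights (historical memory)
    delta    : newly generated fast-weight update
    gate_old : multiplier on the historical memory
    gate_new : multiplier on the new update
    """
    return [
        g_old * t + g_new * d
        for t, d, g_old, g_new in zip(theta, delta, gate_old, gate_new)
    ]
```

Setting both gates to 1 reduces this to a plain additive update, while a `gate_old` of 0 discards the stored memory entirely; input-dependent gates would move between these extremes at each time step.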


quant-ph 2026-06-23

Recursive QLSTM processes variable-length sequences more effectively

by Samuel Yen-Chi Chen, Yifeng Peng +9 more

Recursive QLSTM with Dynamic Variational Quantum Circuit Adaptation

Metacore recursion and numerical tests show gains in temporal information flow across different input lengths.


Recent advances in quantum computing and machine learning have motivated the development of quantum models for sequential data processing. In this paper, we propose a Recursive Quantum Long Short-Term Memory model, or Recursive QLSTM, which extends QLSTM through metacore-based recursive constructions. We numerically test the model under different input sequence lengths, metacore designs, and recursive rules, and identify the best-performing architecture among these variants. For this selected model, we further provide theoretical arguments explaining why its recursive structure improves temporal information propagation and enhances learning performance. Our results suggest that Recursive QLSTM offers a flexible and effective framework for quantum recurrent learning over input time series of various lengths.


cs.NE 2026-06-23

Mass conservation steers NCA reservoirs to stronger criticality

by Tong Zhang, Etienne Guichard +2 more

Mass Conservation as an Inductive Bias for Self-Organized Criticality in NCA Reservoirs

Conserved lattices produce power-law avalanches in more evolutionary runs, evolve faster, and match standard NCA on memory, classification,


Self-organized criticality (SOC), a dynamical regime associated with maximal information processing, offers a promising foundation for reservoir computing. Recent work has shown that neural cellular automata (NCA) can be evolved toward critical avalanche dynamics and employed as effective reservoirs for memory and classification tasks. Here, we investigate whether mass conservation -- a local redistribution rule that preserves total lattice mass -- serves as an inductive bias toward SOC in evolved NCA reservoirs. We compare mass-conserving and standard NCA across multiple independent runs and evaluate both on three downstream benchmarks: 5-bit sequential memory, MNIST digit classification, and CartPole-v1 temporal control. Mass-conserving NCA consistently exhibit stronger criticality, with more runs achieving perfect power-law fits across avalanche distributions, while also being 1.27$\times$ faster during evolution. Importantly, conservation does not impair downstream utility: both variants achieve comparable performance across all three tasks. Furthermore, the reservoir with perfect criticality achieves the highest temporal control score, suggesting a positive link between SOC quality and sequential computation. Our results demonstrate that mass conservation is a simple, effective mechanism for promoting robust criticality in evolved NCA reservoirs without sacrificing downstream performance.
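A minimal sketch of what a mass-conserving local redistribution rule can look like, on a one-dimensional ring for brevity. The paper's lattice, neighbourhood, and update network are not described in the abstract, so the softmax split over (left, stay, right) and the externally supplied `logits` are assumptions standing in for whatever the evolved NCA computes.

```python
import math


def redistribute(mass, logits):
    """Move each cell's mass to itself and its two ring neighbours.

    mass   : list of non-negative floats, one per cell
    logits : per-cell triples of scores for (left, stay, right); in an NCA
             these would come from the cell's update rule (assumed here)
    """
    n = len(mass)
    out = [0.0] * n
    for i, m in enumerate(mass):
        exps = [math.exp(z) for z in logits[i]]
        total = sum(exps)
        for offset, e in zip((-1, 0, 1), exps):
            # Softmax shares sum to 1, so every unit of mass lands somewhere.
            out[(i + offset) % n] += m * e / total
    return out
```

Because each cell only splits what it already holds, the lattice total is preserved at every step regardless of the logits, which is the constraint the abstract treats as an inductive bias.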


cs.LG 2026-06-23

EEG datasets gain a shared task specification layer

by Chengxuan Qin, Zhige Chen +10 more

EEG Benchmarking Needs a Task Specification Layer: NeuroDoc for Rulebook-Guided, Executable Benchmark Construction

A rulebook and task documents convert 53 heterogeneous recordings into 245 executable units that run across foundation model backbones.


Electroencephalography (EEG) foundation models increasingly rely on multi-dataset training and evaluation, yet public EEG datasets still lack a shared task specification layer that can turn heterogeneous recordings into reusable benchmark units. Existing standards organize files, metadata, and provenance, but they do not specify EEG tasks under a common language and rulebook, leaving critical task semantics scattered across papers, code, and manual interpretation. We investigate whether heterogeneous public EEG datasets can be standardized through a structured task specification language paired with a shared rulebook. Our methodology represents each benchmark entry as a task document synchronized with an executable task kernel, with the rulebook defining task fields, evidence requirements, document-kernel alignment, review states, and machine-checkable constraints. Using this methodology, we release a community-reviewed EEG benchmark corpus centered on 53 completed and reviewed entries with 245 task definitions spanning diverse paradigms, and we introduce NeuroDoc and NeuroAudit as the operational support layer for rulebook-guided drafting, upgrading, review, amendment, and release management. We further examine whether the resulting benchmark units can be instantiated in a shared downstream setting across four EEG foundation model backbones, providing execution-based evidence for reusable, auditable, and executable EEG benchmarking infrastructure.


cs.NE 2026-06-23

Evolution locks reservoir networks into shared spectral rules for chaos prediction

by Nima Dehghani

Evolutionary Optimization Reveals Structural Constraints on Reservoir Architecture for Spatiotemporal Chaos

Optimizing five hyperparameters on the Kuramoto-Sivashinsky task reveals conserved modularity and efficiency constraints that improve foreca


Biological systems maintain function in fluctuating environments by transforming past stimulation into internal dynamical states that support future-oriented responses. Reservoir computing provides a computational analogue, but standard formulations often treat the recurrent substrate as a fixed random network and train only the readout. Here we ask how the substrate itself changes when reservoir architecture is placed under evolutionary selection for prediction. Using the Kuramoto--Sivashinsky equation as a testbed for spatiotemporal chaos, we evolved reservoirs over five construction hyperparameters: size, connectivity degree, spectral radius, input scaling, and readout regularization. Evolution reduced prediction error at the population level, extended the low-error forecast horizon, and organized the design space along a diminishing-return size--efficiency frontier. Structural analyses showed that evolved reservoirs remained within a conserved stochastic-block-model-like spectral envelope while refining low-eigenvalue modes, locking modularity to an intermediate band, and pruning connection cost within that band. Pareto analysis showed that elite reservoirs occupied a horizontal floor in the cost--modularity plane, indicating that accuracy and efficiency were achieved jointly rather than through a simple trade-off. These findings show that evolutionary optimization does not merely improve prediction, but exposes interpretable structural constraints on the recurrent substrate: it stabilizes a task-suitable dynamical class and refines the architectural degrees of freedom most relevant for prediction. Evolutionary reservoir computing therefore provides a bio-inspired framework for studying how predictive demands shape adaptive dynamical networks.


cs.LG 2026-06-23

Highway error paths let predictive coding reach 128 layers

by Amirhossein Mohammadi, Alexander G. Ororbia

Error Highways: Scaling Predictive Coding to Very Deep Networks

Linear feedback matrices supply depth-independent corrections so accuracy stays stable on MNIST and Fashion-MNIST.


Predictive coding networks (PCNs) offer a biologically-plausible, local-learning alternative to back-propagation of errors (backprop). Nevertheless, they have remained largely confined to shallow architectures and evaluated on simple machine intelligence benchmarks. A central obstacle to scaling PCNs is that the learning signal decays rapidly as it propagates away from the clamped boundaries, leaving interior layers effectively unchanged. To directly counter this problem, we propose highway error propagation (HEP), a scheme that augments the free energy function underlying predictive coding (PC) by altering its neural structure with feedback matrices $V_{L\to i}$ that couple selected hidden states directly to the clamped output error. Since this coupling is linear in the hidden state, the highway pathway delivers a correction at every inference step whose magnitude is independent of depth, in contrast to vanilla PC where the output error reaches the $i$-th hidden layer with attenuation that decays exponentially in depth. This bypasses the Jacobian chain while preserving the local PC synaptic update rule. On MNIST and Fashion-MNIST, we show that HEP effectively trains MLPs of up to 128 layers with accuracy that is robust with respect to depth.


cs.ET 2026-06-22

Four IP blocks share one interface for neuromorphic sensing and learning

by Poornima Kumaresan, Santhosh Sivasubramani

Design and Development of a Neuromorphic Silicon Suite: PVT Sensing, Stochastic LIF Inference, On-Chip STDP Learning, and Crossbar Programming

PVT sensor, stochastic neuron, STDP controller and crossbar driver all use the same SPI register file in 130 nm CMOS


Edge neuromorphic systems need compact, configurable hardware that combines probabilistic inference, local learning, and an interface to emerging analogue memory. We present four interface-compatible digital IP blocks implemented as standard-cell CMOS on the SkyWater 130 nm process: a process, voltage and temperature (PVT) sensor built from five selectable ring oscillators that also provides a jitter-based true-random-number generator and a frequency-bounds health monitor; a stochastic leaky integrate-and-fire (LIF) neuron with a configurable LFSR, a programmable activation table, and a refractory period; an on-chip spike-timing-dependent plasticity (STDP) controller with a programmable curve and reward-modulated, eligibility-trace, and anti-Hebbian modes; and a memristive-crossbar controller supporting forming, set, reset, read, and automated current-voltage sweep with current-compliance limiting and half-select biasing. All four blocks share a common serial peripheral interface (SPI) register file; the sensor also exposes a parallel readout. Each occupies a single tile at a 50 MHz target. The suite was verified with 99 cocotb tests at register-transfer and gate level (all passing) and taken through an open standard-cell flow, then submitted for tapeout via the Tiny Tapeout shared-silicon programme. Mapped to the open cell library, each block occupies a post-synthesis cell area of 9.3 to 10.6 thousand square micrometres, places at 61 to 70 per cent tile utilisation, meets the 50 MHz constraint with positive setup and hold margin after clock-tree synthesis, and draws an estimated 0.64 to 0.70 mW under a default switching-activity assumption. The contribution is a coherent, openly released set of building blocks unified by one register interface and one verification flow. All results are from simulation and the implementation flow; no fabricated silicon is reported.


cs.AR 2026-06-22

Memristor crossbar supports multi-level analog weights for on-chip LLMs

by David Alejandro Trejo Pizzo

Multi-Level Resistive Synapses for On-Chip Neural Networks: A Physics-Based Design of a Memristive Crossbar Fabric with Quasi-Continuous Conductance States

Physics-derived conductance states enable in-memory inference and learning with projected efficiency gains orders of magnitude above CPUs fo


Building on resistive communication, this paper presents a physics-based design of an on-chip neural network with multi-level memristive synapses supporting a dense spectrum of conductance states. Derived from ionic transport physics, we develop a state-variable model and quantify storable sub-levels under thermal noise, drift, and quantized conductance. We assemble these devices into a 1T1R crossbar fabric, derive the linear algebra of analog vector-matrix multiplication (VMM) under wire resistance, and design a differential synapse for signed weights. A multilayer pipeline executes inference, backpropagation, and weight updates physically in the analog domain. We derive the in-situ outer-product learning rule, its discretization onto the conductance lattice, and the resulting quantization noise. We provide energy, area, capacity, and inter-tile models, showing this substrate is ideally suited for large language models (LLMs). Our design eliminates weight movement, surpassing binary ReRAM and traditional CMOS. We detail the material stack (HfO_2-based), the FEOL/BEOL CMOS foundry-integration flow, a self-contained SPICE model, the complete memristive-FPGA neuromorphic system, and an in-memory self-attention engine with current-mode translinear softmax. Finally, a ternary BitNet datapath shows projected per-token efficiency orders of magnitude better than advanced CPUs or GPUs. The result is a self-contained hardware-native blueprint for a high-density, analog, in-memory neural processor.


cs.NE 2026-06-22

ANN-CANN hybrid stabilizes visual tracking on nine benchmarks

by Yancheng Zhou, Hanle Zheng +2 more

A Theory-grounded Hybrid Neural Network Integrating Complementary Estimation Mechanisms for Stable Visual Object Tracking

Shared-state alignment lets unbiased ANN estimates correct lagged CANN estimates, preserving gains under occlusion and blur.


Hybrid neural networks (HNNs) that integrate artificial neural networks (ANNs) with brain-inspired neural networks have achieved broad success across perception and control tasks. However, much of the current success is confined to neuron-scale hybridization, where discrete, spike-based coding fundamentally limits applicability to continuous-state estimation tasks. In neuroscience, continuous attractor neural networks (CANNs) represent continuous states through neural ensembles, pointing to a population-scale route for HNNs to address this limitation. Yet, principled methodologies for ANN-CANN integration remain largely underexplored. In this work, we propose a theory-grounded ANN-CANN hybridization framework and instantiate it as a hybrid tracking neural network (HTNN) for visual object tracking, a representative continuous-state estimation task. The framework aligns ANN response maps with CANN dynamics in the same state space, enabling the two heterogeneous branches to interact through the shared state representation. Furthermore, we uncover a functional bias-variance complementarity: data-driven ANNs provide asymptotically unbiased estimates, while CANN estimates are low-variance but temporally lagged. By operationalizing this complementarity, HTNN achieves stable and accurate tracking across nine visual tracking benchmarks, consistently outperforming single-network baselines and existing hybrid models. Notably, these performance gains are robustly maintained even under diverse environmental variations, including occlusion, motion blur, and background interference. Through this proof-of-concept study, our framework offers a generalizable foundation for advancing HNNs toward population-scale hybridization.


cs.NE 2026-06-22

Three LLM agents create evolving culture in decaying store

by Simon Jones, Sabine Hauert

Emergent Culture in Minimal LLM Systems

Minimal collectives develop storage strategies and long-range coherence beyond message decay, without top-down design.


What happens when LLM agents operate with no context outside a turn, minimal prompting, and simple tools? Inspired by swarm engineering, we give collectives of three agents the ability to send messages and manipulate a shared actively decaying text store, introducing evolutionary pressure. The agents spontaneously cooperate, develop storage management strategies, and generate complex evolving cultural artifacts, with no top-down engineering. Using tools from dynamical systems analysis, we show that these behaviours exhibit structured long-range coherence beyond the entropy horizon of the decaying store, consistent with emergent culture in the Sperberian sense.


cs.LG 2026-06-22

Large learning rates minimize steps to target accuracy

by Riccardo Poli, Ahmet Yilmaz

Gradient-Descent Steps to Success over Mean Accuracy: A Paradigm Shift for ML

Evaluating models by gradient descent steps instead of peak accuracy shows large rates reduce effort and trigger a switch to many short rest


Traditional evaluation of machine learning (ML) models typically focuses on achieving the maximum possible accuracy irrespective of the computational cost. In this article, we propose a paradigm shift towards evaluating performance based on computational effort, explicitly defined here as the total number of gradient descent steps required to reach an acceptable level of accuracy with high probability. Building upon the concept of computational effort originally introduced by Koza for Genetic Programming, we extend this metric to any ML model trained via gradient descent. Furthermore, we demonstrate that minimising this effort acts as a novel form of Automatic Machine Learning (AutoML). By evaluating it across 11 diverse ML models and five standard classification datasets, we uncover significant insights into the dynamics of gradient-based learning. Our findings reveal that optimal hyper-parameters consistently favour unusually large learning rates. Crucially, we demonstrate that the rapid, aggressive landscape traversal enabled by these large rates not only promotes generalisation (as seen in phenomena like superconvergence) but also statistically minimises the expected computational effort for training. Furthermore, we identify distinct phase transitions in the optimal search strategy: while a single training run suffices for lower accuracy targets, reaching a model's performance limit requires a dramatic shift towards conducting numerous independent, short restarts. Finally, we illustrate how this effort-based paradigm provides a robust framework for model selection, allowing practitioners to choose optimal algorithms based on the difficulty of a problem as perceived by different models for a given target accuracy, or to maximise the achievable accuracy for a fixed budget of gradient descent steps.
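The effort metric can be sketched by carrying Koza's formula over to gradient steps: if a single run of `i` steps succeeds with probability `P(i)`, then `ceil(ln(1 - z) / ln(1 - P(i)))` independent runs are needed to succeed with probability at least `z`, and the effort is the run length times that count, minimised over `i`. The paper's exact definition may differ in detail; the function names and the dictionary input format are assumptions made here.

```python
import math


def restarts_needed(p_success, z=0.99):
    """Independent runs needed to succeed at least once with probability z."""
    if p_success <= 0:
        return math.inf
    if p_success >= 1:
        return 1
    return math.ceil(math.log(1 - z) / math.log(1 - p_success))


def computational_effort(success_by_steps, z=0.99):
    """Minimise steps * restarts over candidate run lengths.

    success_by_steps maps steps-per-run to the estimated probability that a
    single run of that length reaches the target accuracy.
    Returns (effort, steps_per_run, restarts), or None if nothing succeeds.
    """
    best = None
    for steps, p in success_by_steps.items():
        runs = restarts_needed(p, z)
        if runs == math.inf:
            continue
        effort = steps * runs
        if best is None or effort < best[0]:
            best = (effort, steps, runs)
    return best
```

With hypothetical success probabilities of 0.2, 0.6, and 0.95 for runs of 100, 1,000, and 10,000 steps, the minimum falls on many short runs (21 restarts of 100 steps, 2,100 steps in total), which illustrates how a restart-heavy strategy can come out ahead of a single long run.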


cs.NE 2026-06-22

Island GP with CV fitness recovers compact models on 24-row data

by Artem Andrianov (Cyntegrity Germany GmbH, Hofheim am Taunus +1 more

Evolutional Math: Cross-Validated Island-Model Genetic Programming for Interpretable Symbolic Regression on Small, Wide Datasets

Four design choices yield R-squared above 0.99 using tens of thousands of evaluations on synthetic and clinical small-wide sets.


Symbolic regression via genetic programming routinely fails on small, wide datasets - a regime common in clinical-trial monitoring, biostatistics, and engineering pilot studies - by converging on bloated, overfit expressions that exploit correlation rather than prediction. We present Evolutional Math, an open-source genetic programming system that combines four design choices to yield compact, interpretable formulas in this regime. First, fitness is measured by R-squared on held-out cross-validation folds rather than Pearson correlation on the training set, eliminating single-variable shortcuts that correlate but mis-scale. Second, a multi-island architecture runs independent populations seeded with distinct operator subsets (algebraic, logarithmic, trigonometric, and full) with ring-topology migration every M generations, preventing the search from collapsing into one region of formula space. Third, a structural deduplication scheme treats formulas differing only in constants as equivalent, so the elite archive contains structurally distinct candidates rather than near-duplicate variants. Fourth, top-k individuals undergo numerical constant refinement via scipy L-BFGS-B after each migration phase, decoupling structure search from parameter fitting. We evaluate the system on synthetic benchmarks of the form log(x_i) * x_j / (x_k * c), trigonometric mixtures, and an anonymized clinical site-monitoring dataset with 24 rows and approximately 290 candidate numeric features. The system consistently recovers compact ground-truth structures with R-squared at or above 0.99 within tens of thousands of unique formula evaluations. A reference implementation is released under a noncommercial source-available license.
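The first design choice, scoring by R-squared instead of Pearson correlation, is easy to demonstrate: a candidate that is perfectly correlated with the target but mis-scaled gets a perfect correlation score and a very poor R-squared. The sketch below uses made-up numbers and plain training-set scoring for brevity; the system described above applies R-squared on held-out cross-validation folds.

```python
import math


def pearson_r(pred, y):
    """Pearson correlation between predictions and targets."""
    mp, my = sum(pred) / len(pred), sum(y) / len(y)
    cov = sum((p - mp) * (t - my) for p, t in zip(pred, y))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_y = sum((t - my) ** 2 for t in y)
    return cov / math.sqrt(var_p * var_y)


def r_squared(pred, y):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, y))
    ss_tot = sum((t - my) ** 2 for t in y)
    return 1 - ss_res / ss_tot


# Illustrative data: the candidate formula is off by a factor of ten.
y = [1.0, 2.0, 3.0, 4.0]
mis_scaled = [10.0, 20.0, 30.0, 40.0]
```

Here `pearson_r(mis_scaled, y)` is 1 up to rounding while `r_squared(mis_scaled, y)` is strongly negative, so a correlation-based fitness would reward exactly the kind of single-variable shortcut the abstract describes.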


cs.NE 2026-06-22

Hypernetworks generate sparse modular reservoirs for temporal tasks

by Mani Hamidi, Sina Khajehabdollahi +2 more

Distilling a Modular Reservoir Through a Genomic Bottleneck

A compressed generative model produces functional connectivity that solves difficult time-based problems with little extra training.


The intricate structures of biological neural networks largely emerge during development, guided by a comparatively compressed blueprint encoded in the genome. The connectivity that emerges from this decoding process is rich in structure, and already equips the organism with functional modules upon birth. This initial structure serves as a scaffold that can be gradually refined and fine-tuned through lifelong experience, via a variety of plasticity mechanisms. Drawing inspiration from this interaction between evolutionary and developmental modes of learning, we use hypernetworks to learn a compressed generative process that generates the connectivity of a modular reservoir. We show that this marriage between curriculum-based meta-learning and modular reservoir computing can generate sparse recurrent networks that solve difficult temporal tasks with minimal training and without concessions to robustness.


cs.NE 2026-06-22

Minimal spiking net generates stable soliton waves in 2D

by Ch. Meessen

Soliton-like Waves in a Two-Dimensional Recurrent Spiking Neural Network with Weighted Spike-Timing-Dependent Plasticity

Asymmetric excitatory-inhibitory radii and weighted STDP produce self-propagating packets that annihilate on collision and encode source pha


We construct a minimal but biologically plausible spiking neuron model operating in discrete time, combining multiplicative spike-timing-dependent plasticity (WSTDP), divisive normalization of synaptic integration, homeostatic threshold adaptation, and a one-step refractory period. We show that this normalization admits a biologically plausible dendritic implementation in which each binary junction operates using only locally available information. Assembling excitatory-inhibitory pairs of such neurons into a two-dimensional recurrent network and applying periodic localized stimulation, we find that the network spontaneously gives rise to stable, self-propagating wave packets with the properties of dissipative solitons: they maintain a stable spatial profile, propagate at constant speed, and annihilate upon frontal collision. Their emergence requires a geometric asymmetry between excitatory and inhibitory connection radii, and initial inhibitory synapses stronger than excitatory ones. WSTDP engraves the direction of propagation into the synaptic weight profile, so that the network learns by itself to sustain propagation in one direction while suppressing the reverse. When two sources are active simultaneously, the resulting waves annihilate upon collision, defining a semi-persistent boundary whose position encodes the relative phase and frequency of the two sources. These results provide a minimal computational framework for studying the emergence of cortical traveling waves, activity zone delimitation, and spatial memory from local plasticity rules alone.

cs.LG 2026-06-22

Warm-start library reduces amortized cost to O(KD/ε²+(R-K)logK/Δ²)

by Jianwei Lou (RailMind Systems, Neuss +1 more

Gradient-Free Warm-Start Library Recovery: an Amortized-Regret Separation

Recognition costs O(log K/Δ²) independent of dimension while estimation costs Θ(D/ε²), giving advantage linear in recurrences and D when seg

Continual learning that is gradient-free, local, online, and append-only is attractive for edge and streaming deployment, but its value is usually argued informally. We give a provable account on recurring-regime streams. Given segmentation, a warm-start library learner attains amortized recovery cost $O\!\big(KD/\varepsilon^2+(R-K)\log K/\Delta^2\big)$ versus a memoryless re-estimator's $\Theta(RD/\varepsilon^2)$, an advantage $(R-K)\,\Theta(D/\varepsilon^2)$ growing with dimension $D$ and recurrence density. The mechanism is a decoupling: recognizing which of $K$ seen regimes is active costs $O(\log K/\Delta^2)$, independent of $D$, whereas estimating a regime costs $\Theta(D/\varepsilon^2)$. We prove this is tight: matching lower bounds give recognition $\Theta(\log K/\Delta^2)$ and a memoryless-class bound $\Omega(RD/\varepsilon^2)$, so each term is individually minimax-tight (the joint statement is conditional). The separation is born-immune (a memoryless learner's advantage is identically zero) and paradigm-level: it matches, and does not beat, a fair spawn-capable Bayesian baseline; the contribution is attaining this cost structure without end-to-end backprop and with zero forgetting by construction. A count-calibrated variant ties the baseline's leading constant up to a bounded, never-negative per-recurrence overshoot, hyperparameter-free and with no per-step transcendentals. We bound the scope: recognizable regimes are capped by simplex packing (walls $e^{\Theta(D)}$); autonomous segmentation is impossible at the packing wall (no detector escapes the false-alarm/delay frontier as regimes overlap); the advantage vanishes under overlap. The dimension-dependent separation is corroborated on synthetic streams and real $k$-mer genome distributions (memoryless cost $\propto D^{1.04}$, recognition $D$-independent); the one real sequential stream sits in the $D{=}1$ near-null corner.
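
To make the claimed separation concrete, the two cost expressions from the abstract can be evaluated numerically. The sketch below is only an illustration: it sets every hidden constant in the $O(\cdot)$ and $\Theta(\cdot)$ bounds to 1, and the function names and sample values of $R$, $K$, $D$, $\varepsilon$, $\Delta$ are assumptions chosen for the example, not figures from the paper.

```python
import math

def library_cost(R, K, D, eps, delta):
    """Warm-start library: estimate each of the K regimes once, then
    only recognize the regime on the remaining R - K recurrences.
    All hidden constants are assumed to be 1."""
    estimate = K * D / eps**2
    recognize = (R - K) * math.log(K) / delta**2
    return estimate + recognize

def memoryless_cost(R, D, eps):
    """Memoryless re-estimator: pays the full estimation cost on every segment."""
    return R * D / eps**2

# Hypothetical stream: 100 segments drawn from 5 regimes in 64 dimensions.
R, K, D, eps, delta = 100, 5, 64, 0.1, 0.5
lib = library_cost(R, K, D, eps, delta)
mem = memoryless_cost(R, D, eps)
print(f"library: {lib:.0f}  memoryless: {mem:.0f}  ratio: {mem / lib:.1f}")
```

Under these assumptions the gap is $(R-K)\,(D/\varepsilon^2 - \log K/\Delta^2)$, so it grows with both the number of recurrences and the dimension, which is the qualitative behavior the abstract describes.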

cs.NE 2026-06-22

Multiple offspring improve diversity optimisation with adapted selection

by Adel Nikfarjam, Jakob Bossek +2 more

On the Use of Survival Selection Methods for Evolutionary Diversity Optimisation

Because each solution's contribution depends on the rest of the set, standard replacement fails for simultaneous updates; tailored survival

Generating a diverse set of high quality solutions for an optimisation problem has been studied extensively in recent years by the evolutionary computation community. A paradigm that has received increasing attention is evolutionary diversity optimisation (EDO), where the goal is to maximise the diversity of a solution set subject to quality constraints. Since the contribution of each solution to the diversity of the population depends on other solutions and can change dramatically if several solutions in the population are modified simultaneously, most EDO approaches generate a single new solution per generation and discard the solution with the least contribution to diversity, ensuring a steady increase in population diversity over successive generations until convergence. In this study, we aim to answer two questions: (1) Is generating multiple solutions in each generation beneficial for EDO? (2) How can this be achieved efficiently, given that conventional survival selection methods do not work well in EDO due to the dependency of a solution's contribution to diversity on other solutions?
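
The single-offspring scheme described above (generate one new solution, then discard the solution contributing least to diversity) can be sketched as follows. This is a minimal illustration under stated assumptions: solutions are bit tuples, diversity is the sum of pairwise Hamming distances, quality is the number of ones, and the names `steady_state_step`, `quality`, and `min_quality` are hypothetical rather than taken from the paper.

```python
import random

def distance(a, b):
    """Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))

def diversity(pop):
    """Population diversity: sum of all pairwise distances (an assumed measure)."""
    return sum(distance(pop[i], pop[j])
               for i in range(len(pop)) for j in range(i + 1, len(pop)))

def steady_state_step(pop, quality, min_quality, rng):
    """Create one offspring, then drop the least-contributing solution."""
    parent = rng.choice(pop)
    flip = rng.randrange(len(parent))
    child = tuple(1 - bit if i == flip else bit for i, bit in enumerate(parent))
    if quality(child) < min_quality:
        return pop  # offspring violates the quality constraint; keep the old set
    candidates = pop + [child]
    # Contribution of a solution = its summed distance to all other candidates.
    contribution = [sum(distance(c, other) for other in candidates)
                    for c in candidates]
    worst = contribution.index(min(contribution))
    return candidates[:worst] + candidates[worst + 1:]

rng = random.Random(0)
population = [(1,) * 10 for _ in range(5)]  # five identical all-ones solutions
for _ in range(200):
    population = steady_state_step(population, quality=sum, min_quality=6, rng=rng)
print(diversity(population), population)
```

With this diversity measure, removing the solution with the smallest contribution can never lower diversity below its previous value, because discarding the offspring itself is always one of the options. That guarantee depends on changing one solution at a time, which is why the multi-offspring setting needs a different survival selection.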

cs.LG 2026-06-19

Evolved reward schedules lift RL returns 11 percent on navigation tasks

by Alan Nadelsticher Ruvalcaba

Evolutionary Discovery of Developmental Reward Schedules in Deep Reinforcement Learning

Search over time-varying agency, novelty, and reactivity weights beats fixed extrinsic rewards on two MiniGrid environments.

The temporal structure of reward composition in reinforcement learning (RL) is typically hand-designed and held fixed throughout training, leaving the progression of motivational priorities largely unexplored. In this work, we propose an evolutionary framework for discovering developmental reward schedules, in which three distinct biologically inspired motivational components -- agency, novelty, and reactivity -- are combined through time-varying weights that shift over the course of training. We evaluate on two sparse-reward MiniGrid tasks, DoorKey-6x6 and KeyCorridorS3R1, comparing the generalizability of four evolutionary algorithms (CMA-ES, xNES, DE, and L-SHADE) against an extrinsically motivated baseline (our main comparison point) and three additional hand-designed methods. On DoorKey-6x6, all evolved methods outperform the non-evolved baselines, with L-SHADE achieving the best performance -- an approximate relative mean improvement of 11.4% over the extrinsic-only baseline. On KeyCorridorS3R1, CMA-ES achieves the best overall performance, with the remaining evolved methods showing weaker and less reliable generalization capability compared to the extrinsic-only baseline. Interestingly, the discovered schedules diverge from our defined developmental ordering, with novelty consistently emerging as the dominant early signal during training across both tasks. Collectively, our results position evolutionary optimization as a promising approach for developmental reward schedule discovery in deep reinforcement learning, and suggest that what evolution finds to be optimal in computational settings may differ from what it finds to be optimal in biology. The code for this project can be found at: https://github.com/alannadels/Evolutionary_RL.git.
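
A time-varying reward schedule of the kind described above might look like the following sketch. The abstract does not specify how the schedule is parameterized, so everything here is an assumption made for illustration: the weights are linearly interpolated between a start and an end vector, the intrinsic terms are added to the extrinsic reward, and the function names and numeric values are hypothetical. In such a setup, an evolutionary algorithm would search over the `start` and `end` vectors.

```python
def schedule_weights(progress, start, end):
    """Linearly interpolate component weights; progress runs from 0.0 to 1.0.
    Linear interpolation is an assumed parameterization, not the paper's."""
    p = min(max(progress, 0.0), 1.0)
    return {name: (1 - p) * start[name] + p * end[name] for name in start}

def shaped_reward(extrinsic, components, weights):
    """Extrinsic reward plus the weighted intrinsic components (assumed form)."""
    return extrinsic + sum(weights[name] * components[name] for name in weights)

# Hypothetical schedule: novelty dominates early, agency late.
start = {"agency": 0.1, "novelty": 0.8, "reactivity": 0.1}
end = {"agency": 0.6, "novelty": 0.1, "reactivity": 0.3}

for step, total in [(0, 1000), (500, 1000), (1000, 1000)]:
    w = schedule_weights(step / total, start, end)
    r = shaped_reward(0.0, {"agency": 1.0, "novelty": 1.0, "reactivity": 1.0}, w)
    print(step, w, round(r, 3))
```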

cs.NE 2026-06-19

NEOL proves sublinear regret for two-timescale neuroevolution

by Shishen Lin, Yixin Chen

Provably Sub-Linear Two-Timescale NeuroEvolution with Online Plasticity

Decoupling architecture search from reward-modulated weight adaptation yields the first regret bound and better results than pure NEAT on co

NeuroEvolution of Augmenting Topologies (NEAT) is a widely used neuroevolution algorithm for learning neural network architectures and weights for control tasks. However, standard offline optimisation searches for connection strengths directly, which can scale poorly in high-dimensional weight spaces and more difficult continuous control problems. Hybrid methods that combine neuroevolution with online learning can address this challenge, but their theoretical properties remain underexplored. This paper gives the first regret analysis for a general NeuroEvolutionary Online Learning (NEOL) framework, which decouples learning into two timescales: an outer loop for architecture search and an inner loop for online weight adaptation via reward-modulated plasticity. Under mild conditions, we prove that NEOL achieves sublinear regret. Empirically, under fixed interaction budgets on four standard control benchmarks, a NEAT-based NEOL implementation achieves higher final fitness and lower variance than pure NEAT, and is competitive with strong reinforcement learning (RL) baselines on several tasks. The results are supported by Wilcoxon rank-sum tests and ablation studies. Overall, the findings show that online plasticity can improve the sample efficiency and robustness of two-timescale neuroevolution. Code is available at https://github.com/boobaa2001/NeuroEvolution Online Learning NEOL.

cs.SE 2026-06-19

Formal verification step forces correct JSON-to-FHIR translators

by Colin Samplawski, Adam D. Cobb

Formally Verified Code Synthesis for Structured Data Translation in a Medical Internet of Things

LLM evolutionary synthesis plus schema checks produces reliable device integration code for medical networks.

In this work we present an LLM-powered, evolutionary code synthesis system for structured data translation in a Medical Internet of Things setting. A key challenge in this domain is ensuring that the synthesized code is trustworthy and reliable. To this end, we integrate a formal verification step into our code synthesis pipeline to ensure that any generated code is guaranteed to satisfy predefined requirements. In particular, we present a case study of integrating a novel device (a pulse oximeter) into the existing network of devices. Our system generates a formally verified translation between the device's JSON schema and the Fast Healthcare Interoperability Resources (FHIR) format used by the wider system. This formal verification stage ensures that structured data translated by the generated code will always conform to the target output schema. We provide a set of experimental results which demonstrate that our system is able to consistently generate correct translations at low cost.

cs.AI 2026-06-19

Models with equal accuracy can violate logical rules differently

by Guillaume Olivier Delplanque, Pierre Genevès (LIG) +3 more

Beyond Accuracy: Measuring Logical Compliance of Predictive Models

The Rule Violation Score counts breaches of domain constraints that accuracy metrics overlook.

Machine learning models are predominantly evaluated through predictive performance metrics such as ranking quality, prediction error, or classification accuracy. While these metrics effectively quantify how closely predictions match the ground truth, they do not assess whether model outputs respect predefined logical or domain-specific constraints. In high-stakes applications, including healthcare, finance, and autonomous systems, logical consistency can be as critical as predictive accuracy, yet no standard metric captures this dimension. We introduce the Rule Violation Score (RVS), a complementary evaluation metric that quantifies the extent to which a predictive model respects a given set of logical rules, independently of predictive accuracy. RVS treats hard rules (strict constraints) and soft rules (statistical regularities) differently, can be evaluated on any dataset and on any predictive model expressed over a relational vocabulary, and can be computed using SQL queries that are automatically generated for Horn rules. Beyond evaluating models, RVS can also evaluate the logical consistency of training datasets and help identify poorly defined rules. We evaluate RVS on three benchmarks covering knowledge graph link prediction and relational regression, including rule-based, embedding-based, and neuro-symbolic predictive models. Our results demonstrate that two models achieving comparable predictive accuracy can exhibit substantially different levels of logical compliance, revealing differences in model behavior that standard metrics fail to capture.

cs.NE 2026-06-19

Rate-coded SNN reaches 99.09% on 64-class ImageNet using local rules

by Denis Larionov, Khairutin Shtanchaev +3 more

Hybrid ANN-SNN Pipeline with Local Plasticity

Pretrained ANN embeddings are converted to spikes and classified by an SNN trained without end-to-end backpropagation.

This work proposes a hybrid ANN-SNN pipeline that effectively leverages the rich embeddings of pretrained artificial neural networks (ANNs) to enable high-performance spiking neural networks (SNNs). The architecture couples a pretrained EfficientNet encoder with a CoLaNET spiking classifier. We convert the encoder's activations into spike trains via rate-coding and train the subsequent SNN classifier using local, biologically inspired learning rules, bypassing end-to-end gradient propagation. This approach achieves 99.09% accuracy on a 64-class ImageNet benchmark, demonstrating performance on par with conventional deep networks. The work presents a biologically plausible and efficient framework for adapting powerful pretrained encoders to downstream spiking neural network tasks.
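
The rate-coding step mentioned above, converting real-valued encoder activations into spike trains, can be illustrated with a short sketch. The abstract says only that rate-coding is used, so the details here are assumptions: activations are min-max normalized into firing probabilities and each time step draws an independent Bernoulli spike. The function name `rate_encode` and the sample values are hypothetical.

```python
import random

def rate_encode(activations, num_steps, rng):
    """Turn a feature vector into a binary spike train of shape
    [num_steps][len(activations)]. Larger activations fire more often.
    Min-max normalization and Bernoulli sampling are assumed choices."""
    lo, hi = min(activations), max(activations)
    span = (hi - lo) or 1.0  # avoid division by zero for constant inputs
    probs = [(a - lo) / span for a in activations]
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(num_steps)]

rng = random.Random(0)
embedding = [0.0, 0.5, 2.0]  # stand-in for pretrained encoder activations
spikes = rate_encode(embedding, num_steps=200, rng=rng)
rates = [sum(step[i] for step in spikes) / len(spikes)
         for i in range(len(embedding))]
print(rates)  # empirical firing rates, roughly proportional to the inputs
```

The downstream spiking classifier would then consume `spikes` one time step at a time; a longer `num_steps` gives a less noisy estimate of each rate at the cost of more simulation steps.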

cs.NE 2026-06-19

Weight adaptation lifts ASNG results on binary problems

by Yutaro Yamada, Kento Uchida +1 more

Weight Adaptation for Improving Parallel Performance of Adaptive Stochastic Natural Gradient

Maximizing estimated signal from gradient accumulations lets WA-ASNG beat baselines at populations 25-100 and under noise.

Probabilistic model-based evolutionary algorithms are promising for black-box optimization. Specifically, the adaptive stochastic natural gradient (ASNG) adaptively updates its learning rate, a typical hyperparameter in probabilistic model-based evolutionary algorithms, thereby realizing efficient and robust optimization. Although weight parameters are common hyperparameters, with the increasing demand for parallel evaluation of time-consuming tasks, it remains unclear how to set suitable weights for larger population sizes. In this paper, we propose Weight Adaptation ASNG (WA-ASNG), which incorporates a weight adaptation mechanism into ASNG. We calculate the estimated signal of the update direction from the accumulations of the natural gradient. Then, to maximize the signal, WA-ASNG adaptively updates its weight parameters by gradient ascent over the course of optimization. While the learning rate adaptation plays a role in satisfying a sufficient condition for monotonic improvement of the expected objective value, the mechanism of weight adaptation is intended to maximize this improvement. The experimental results demonstrate that WA-ASNG outperforms PBIL and ASNG across various settings with population sizes ranging from 25 to 100 for binary optimization problems. Furthermore, WA-ASNG can perform efficiently in the presence of strong noise. Our code is available at https://github.com/shiralab/WA-ASNG.

cs.NE 2026-06-18

Learnable encoder reaches 94.97% on speech commands; 35k-parameter variant reaches 89.8%

by Taharim Rahman Anon, Jakaria Islam Emon

Adaptive Speech-to-Spike Encoding for Spiking Neural Networks

The compact model matches larger baselines by producing task-specific spikes instead of reconstructing the waveform.

The mismatch between continuous acoustic signals and discrete event-driven processing remains a fundamental bottleneck for neuromorphic speech processing. Current systems typically rely on fixed spike encoders, forcing downstream Spiking Neural Networks (SNNs) to compensate for non-adaptive input representations. To address this, we present a learnable residual speech-to-spike encoder jointly trained end-to-end with a Recurrent Leaky Integrate-and-Fire (R-LIF) backbone. We validate this approach on the Google Speech Commands v2 (GSC-v2) benchmark, achieving up to 94.97% accuracy. Notably, the learned encoder remains highly parameter-efficient with a compact 35k-parameter variant that reaches 89.8%, matching or exceeding prior baselines that require an order of magnitude more parameters. Our encoder-focused analysis, including linear probing and gradient-residual inspection, indicates that the encoder does not target faithful signal reconstruction but instead learns task-aligned spike representations that enhance class separability. Finally, we benchmark bio-inspired, hardware-friendly credit assignment by comparing Direct Feedback Alignment (DFA) with surrogate-gradient BPTT under identical architectures and training conditions. We find that DFA reaches 91.5% accuracy, quantifying the performance trade-off of bio-inspired learning rules for modern neuromorphic audio.
