q-bio.MN

Molecular Networks

Gene regulation, signal transduction, proteomics, metabolomics, gene and enzymatic networks

math.PR 2026-06-30

Phosphorylation model admits regime with two stable equilibria

by Lucie Laurence, Philippe Robert

Thermodynamic Limits of Stochastic Chemical Reaction Networks with Phosphorylation

With substrate fixed at N and enzymes scaling with N, specific catalytic constants yield three equilibria of which two are stable.

In this paper we investigate the stability properties of a fundamental mechanism of biological cells called phosphorylation. The system is a chemical reaction network (CRN) in which a chemical species, the substrate, can be sequentially transformed into two phosphorylated forms by the activity of two types of enzymes, one type for phosphorylation, the other for dephosphorylation. We investigate a stochastic representation of this model under mass action kinetics. The total mass of the substrate is fixed at $N$, while the total mass of enzymes scales proportionally to $N$. The asymptotic behavior, when $N$ is large, of the concentrations of all chemical species is studied. We investigate the possible stable subsets of chemical species for the kinetics of the law of mass action. A stable subset is such that, with a convenient initial state, the number of copies of the species of this subset remains $O(1)$ on any finite time interval as $N$ gets large. The role of the twelve reaction rate constants, the catalytic constants of the CRN, is investigated from this point of view. An averaging principle of the corresponding Markov process is established for several regimes of the CRN. It is shown in particular that there exists a regime with three equilibrium points, with two of them stable. The proofs of the results rely on stochastic calculus with Poisson processes, convenient couplings of subsets of coordinates of the Markov process, technical results on $M/M/\infty$ queues, and a stability analysis of a dynamical system in $\mathbb{R}_+^4$.
math.DS 2026-06-29

Generic parameters restrict exact lumping to obvious reductions

by Justin Eilertsen, Valery G. Romanovski +2 more

Lumping of reaction networks: Generic and critical parameters

Only elimination of non-reactants or projections along integrals survive for open sets of parameters; algorithms locate the special critical

We investigate linear lumping for parameter-dependent mass action reaction networks, distinguishing between generic and critical parameter regimes. For generic parameters -- those ranging in some non-empty open subset of parameter space -- we prove that exact linear lumping yields only "obvious" reductions: elimination of non-reactant species or projections along stoichiometric first integrals. This characterization extends to reaction networks with product-form kinetics, including Michaelis-Menten and Hill-type rate laws. For mass action systems we proceed to develop an algorithmic approach to identify critical parameter sets -- algebraic subvarieties in parameter space where non-trivial lumpings become available. This procedure reduces the determination of lumping maps to a system of finitely many polynomial equations. It also applies to constrained lumping scenarios (which are frequently motivated by chemical considerations). We then review and extend results about proper lumpings. Finally, we discuss lumpings of a self-replicator system, and of a two-pathway enzyme mechanism, to document the viability of our methods in relevant scenarios. Our results clarify the relationship between structural (parameter-independent) and fine-tuned (parameter-dependent) reductions, with implications for approximate lumping when system parameters lie near critical values
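The distinction between generic and critical parameters can be illustrated on the simplest special case: a first-order network dx/dt = Ax with a proper lumping, where each species belongs to exactly one lump. Such a lumping y = Mx is exact when MA = ÂM for some matrix Â, which holds exactly when, for every target lump, the summed rate coefficient into that lump is the same for all species in a given source lump. The sketch below checks this condition. It is a minimal illustration under stated assumptions, not the algorithm of the paper: the function names and the toy network A → C (rate k1), B → C (rate k2) are hypothetical.

```python
def is_exact_proper_lumping(A, lumps, tol=1e-12):
    """Check exactness of a proper lumping for the linear system dx/dt = A x.

    A[i][j] is the coefficient of x_j in dx_i/dt.
    lumps is a partition of the species indices, e.g. [[0, 1], [2]].
    The lumping is exact iff, for each target lump, the column sums of A
    restricted to that lump agree across all species of each source lump.
    """
    for target in lumps:
        for source in lumps:
            col_sums = [sum(A[i][j] for i in target) for j in source]
            if max(col_sums) - min(col_sums) > tol:
                return False
    return True


def toy_network(k1, k2):
    """Hypothetical network A -> C (rate k1), B -> C (rate k2); order A, B, C."""
    return [
        [-k1, 0.0, 0.0],
        [0.0, -k2, 0.0],
        [k1, k2, 0.0],
    ]


if __name__ == "__main__":
    lump_ab = [[0, 1], [2]]   # merge A and B, keep C
    total = [[0, 1, 2]]       # project onto total mass A + B + C
    print(is_exact_proper_lumping(toy_network(1.0, 2.0), lump_ab))  # generic rates
    print(is_exact_proper_lumping(toy_network(1.0, 1.0), lump_ab))  # k1 == k2
    print(is_exact_proper_lumping(toy_network(1.0, 2.0), total))    # first integral
```

In this toy case, merging A and B is exact only on the critical set k1 = k2, whereas the projection onto the conserved total A + B + C is exact for all rate constants, matching the "obvious" reductions along stoichiometric first integrals described in the abstract.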
q-bio.MN 2026-06-24

Scale-splitting extracts dominant pathways and analytical formulas from reaction networks

by J. Unterberger, U. Herbach +1 more

Hierarchical models for large chemical reaction networks

Recursive coarse-graining yields simplified graphs and rate expressions accurate under scale separation in dilute regimes, enabling rate inf

The quest for the origin of life, especially in the metabolism-first scenario inspired by the celebrated Miller-Urey experiment, has triggered a research program dedicated to studying the emergence of complex dynamical behaviors in large chemical mixtures. Though autocatalysis, understood as the capacity of a reaction network to grow exponentially, has been recognized as a potential driver of instability and multistability, no quantitative theory has yet emerged, partly because of the lack of available kinetic data. We introduce a computational tool for large chemical reaction networks based on a scale-splitting algorithm inspired by Wilson's renormalization group. We focus on dilute regimes, where species of interest have low concentration, non-unimolecular reactions may be neglected, and the dynamics is close to linear. Depending on parameter thresholds, such networks can exhibit autocatalytic behavior. Our algorithm takes as input a network structure and outputs (1) a simplified effective graph containing the dominant reaction pathways, obtained through recursive coarse-graining; and (2) analytical formulas for the dynamics in terms of kinetic rates, called hierarchical formulas. These formulas are approximate but interpretable, accurate when scale separation is effective, and provide a reliable multiscale description of the dynamics. Their domains of validity define kinetic phases, each typically associated with a distinct pattern of chemical composition. We show on a simple example that this approach enables fast and reliable inference of kinetic rates from concentration time series. Hierarchical formulas have been implemented as a Python package and are illustrated on a simplified model of the formose reaction.
cs.LG 2026-06-23

Attention-free memory cuts Koopman rollout errors over 1000 steps

by Mohammed Nagdi, Evangelos-Marios Nikolados +4 more

Learning the Koopman Operator using Attention Free Transformers

AFT block plus change-point re-encoding keep predictions on the manifold longer than plain Koopman or multi-head attention on three benchmar

Learning Koopman operators with autoencoders enables linear prediction in a latent space, but long-horizon rollouts often drift off the learned manifold, leading to phase and amplitude errors on systems with switching, continuous spectra, or strong transients. We introduce two complementary components that make Koopman predictors more robust. First, we add an attention-free latent memory (AFT) block that aggregates a short window of past latents to produce a corrected latent before each Koopman update. Unlike multi-head attention, AFT operates in linear time and adds only $\approx$30k parameters ($3d^2 + T^2$, fewer than matched multi-head attention), yet captures the local temporal context needed to suppress error divergence. Second, we propose dynamic re-encoding: lightweight, online change-point triggers (EWMA, CUSUM, and sequential two-sample tests) that detect latent drift and project predictions back onto the autoencoder manifold. Across three benchmark systems -- Duffing oscillator, Repressilator, IRMA -- our model consistently reduces error accumulation compared to a Koopman autoencoder and matched-capacity multi-head attention. We also compare against GRU and Transformer autoencoders, evaluated both from initial conditions and with a 50-step context, and find that Koopman+AFT (with optional re-encoding) attains markedly lower long-horizon error while maintaining lower inference latency. We report improvements over horizons up to 1000 steps, together with ablations over trigger policies. The result is a fast, compact predictor that stays on the learned manifold over long horizons.
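The change-point triggers named in the abstract (EWMA and CUSUM) can be sketched in a few lines. The code below is a generic illustration of the two textbook detectors applied to a scalar drift signal. It is not the authors' implementation: which signal is monitored (here, a hypothetical per-step error value), the parameter defaults, and the function names are all assumptions.

```python
def ewma_trigger(signal, alpha=0.2, threshold=1.0):
    """Return the first index where the exponentially weighted moving
    average of the signal exceeds the threshold, or None if it never does."""
    smoothed = 0.0
    for i, value in enumerate(signal):
        smoothed = alpha * value + (1.0 - alpha) * smoothed
        if smoothed > threshold:
            return i
    return None


def cusum_trigger(signal, target=0.0, slack=0.5, threshold=2.0):
    """One-sided CUSUM: accumulate excursions above target + slack and
    return the first index where the running sum exceeds the threshold."""
    running = 0.0
    for i, value in enumerate(signal):
        running = max(0.0, running + (value - target - slack))
        if running > threshold:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical drift signal: small errors, then a sustained jump at step 5.
    drift = [0.1] * 5 + [3.0] * 5
    print(ewma_trigger(drift))   # step at which EWMA would request re-encoding
    print(cusum_trigger(drift))  # step at which CUSUM would request re-encoding
```

In a rollout loop, a non-None return value would mark the step at which the predicted state is passed back through the autoencoder before continuing. CUSUM reacts to a sustained shift as soon as the accumulated excess crosses the threshold, while EWMA lags by an amount set by `alpha`.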
q-bio.MN 2026-06-23

Output networks buffer core clock period variability

by Ismail M Nur, Hotaka Kaji +2 more

Circadian output network can buffer period variability

Simulations show serial downstream paths actively dampen timing fluctuations rather than merely transmitting them.

Circadian rhythms are biological oscillations that govern 24-hour physiological and behavioral processes across most organisms. Recent bioimaging studies have revealed that even individual cells can exhibit circadian rhythms. The period of cellular oscillations can fluctuate due to molecular noise in the circadian clock machinery. Whether regulatory networks downstream of the clock amplify or attenuate clock-derived period fluctuations remains poorly understood. In this study, we numerically observed period variability in a self-sustained oscillator coupled to an output network. Our numerical calculations demonstrated that a serial pathway does not merely relay timing signals but actively shapes rhythmic reliability. The extent of the resulting reduction in period variability depended on parameters of both the clock and output systems. For more complex output networks, the shortest-path length from the core oscillator was a major determinant of increased oscillation precision. This noise-buffering effect saturated in long cascades. These results suggest the existence of an intrinsic precision-enhancing mechanism embedded within circadian output networks.
cs.LG 2026-06-19

Decomposing molecules into atom-pair subgraphs improves predictions

by Trung Nguyen, Duc Duy Nguyen

MMGNN: Multi-level, multi-color graph neural networks for molecular property prediction

MMGNN processes overlapping subgraphs separately before aggregation, leading to top scores on several MoleculeNet tasks.

Molecular message-passing neural networks commonly propagate chemically diverse interactions through a single graph, which may mix interaction-specific signals and require deep propagation to capture long-range effects. We introduce the Multi-level, Multi-color Graph Neural Network (MMGNN), a hierarchical framework that decomposes a molecular graph into overlapping atom-type-pair-specific subgraphs while preserving atom-level resolution. MMGNN-2D constructs chemical-colored subgraphs from covalent connectivity, whereas MMGNN-3D constructs geometric-colored subgraphs from spatial proximity and augments their edges with distance, angular, and torsional descriptors. Both variants apply a shared communicative message-passing backbone to each subgraph and combine the resulting representations through atom-wise aggregation and molecular readout. We evaluated MMGNN on five classification and three regression benchmarks from MoleculeNet using common scaffold splits and five independent runs. MMGNN-2D achieved the highest macro-average AUC-ROC of 0.838 across the classification datasets and the lowest RMSE on ESOL (0.803). MMGNN-3D obtained the highest mean AUC-ROC on BBBP (0.956) and the lowest RMSE on FreeSolv (1.793), indicating complementary strengths of topological and geometric representations. Structural and leave-one-out analyses further illustrate how the subgraph decomposition affects learned representations and atom-type-pair sensitivities. These results support overlapping interaction-specific graph decomposition as a competitive strategy for molecular property prediction.
q-bio.MN 2026-06-19

Noise creates Turing patterns in stable gene networks

by Manuel Eduardo Hernández-García, Jorge Velázquez-Castro

Oscillations and Spatial Patterns in Large-Scale Stochastic Gene Regulatory Networks

Molecular fluctuations drive spatial instability even with uniform diffusion rates in cyclic negative-feedback systems.

Gene regulatory networks (GRNs) are fundamental to cellular growth and tissue formation, orchestrating spatially and temporally regulated gene expression during development. These networks are inherently subject to intrinsic fluctuations arising from molecular noise, making the analysis of their stability essential for understanding robust pattern formation and developmental dynamics of the organism. In this study, we analyze the stability and dynamics of cyclic GRNs with negative feedback and diffusion, considering both deterministic and stochastic approaches. In the deterministic case, the system exhibits a bifurcation between stability and instability, leading to Hopf instability in the absence of diffusion and to Turing-Hopf instability when diffusion is included. It was observed that the discretization of the spatial domain introduces additional unstable modes, enabling a wider range of patterns. The stochastic framework based on the second-moment approach, which incorporates intrinsic fluctuations, reveals that for small system sizes, fluctuations can dominate the dynamics and induce stochastic Turing instability, even when the system is stable in the absence of diffusion. Notably, Turing instabilities can emerge even when all variables have the same diffusion rate. The developed framework provides a systematic method for analyzing the stability of high-dimensional stochastic systems with diffusion, thereby simplifying the prediction of Turing and Turing-Hopf instabilities. These findings contribute to a deeper understanding of the complex dynamics and pattern formation in GRNs, with potential implications for biological processes, such as cellular differentiation and development.
q-bio.MN 2026-06-16

Division redirects identical cells to opposite fates

by Charli Austin, Nikola Popovic +1 more

Cell Division Changes Fate Decisions in a Genetic Toggle Switch

Analytical separatrices reveal a region where omitting division yields wrong stable-state predictions.

Gene regulatory networks govern cellular fate decisions through multistable dynamics. The genetic toggle switch is a canonical model of such behaviour; yet, the impact of cell division on its dynamics remains poorly understood. We derive analytical separatrices for a simplified Boolean toggle switch with and without division. We show that division can redirect trajectories with identical initial conditions to opposing stable states, and we define a region of disagreement where fate decisions are predicted incorrectly if division is neglected. Our results imply that division can fundamentally reshape fate boundaries in multistable regulatory networks.
cs.AI 2026-06-12

Genomic profile sets Bayesian prior to separate nature from nurture

by Aruna Dey, Suraj Biswas

Is It You or Your Environment? A Bayesian Inference Framework for Genomically-Anchored Personalized Physiological Interpretation

Fixed genetic anchor distinguishes constitutional from environmental effects from the first measurement onward

Personalized health AI systems face a fundamental cold-start problem: machine learning models for physiological interpretation require weeks of individual behavioral data before they can distinguish constitutional variation from environmentally driven deviation. We propose a solution grounded in causal inference and Bayesian prior design. An individual's genomic profile serves as an exogenous genetic anchor -- a domain-informed, personalized prior that is fixed at conception, immune to reverse causation, and available before a single behavioral observation is collected. The anchor initializes a Bayesian belief state over an individual's physiological set point G-hat = mu + sum(beta_i * g_i), where beta_i are GWAS-derived effect sizes and g_i are risk-allele counts. Each incoming physiological measurement P produces a non-constitutional deviation delta = P - G-hat that separates the signal attributable to environment and state from the constitutionally fixed baseline. As behavioral data accrue, the prior decays according to G-hat_t = w(t)*G-hat_genomic + [1-w(t)]*P-bar_t, transitioning from genome-dominated to empirical-baseline-dominated inference. The same observed HRV of 55 ms generates a suppression hypothesis for a person whose prior predicts 80 ms, and an enhancement hypothesis for a person whose prior predicts 30 ms -- a reversal impossible without a personalized anchor. We develop this architecture across six physiological domains, grading genomic priors by evidence strength, distinguishing robustly replicated anchors (FTO, FADS1/2, FKBP5) from contested candidate genes (SLC6A4, MAOA, DRD2). We address the inference boundary between association, Mendelian randomization, and individual token causation, and define four constraints for deployment: evidence-graded priors, dynamic decay, ancestry-matched effect sizes, and attribution rather than deterministic output.
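The three formulas in the abstract (the genomic set point, the deviation, and the decaying prior) translate directly into code. The sketch below implements them as written. The function names are hypothetical, and the abstract does not specify the form of w(t), so the exponential half-life schedule used here is purely an assumption for illustration.

```python
def genomic_set_point(mu, betas, allele_counts):
    """G-hat = mu + sum(beta_i * g_i), with beta_i the effect sizes and
    g_i the risk-allele counts."""
    if len(betas) != len(allele_counts):
        raise ValueError("betas and allele_counts must have the same length")
    return mu + sum(b * g for b, g in zip(betas, allele_counts))


def deviation(measurement, set_point):
    """delta = P - G-hat: the non-constitutional part of a measurement."""
    return measurement - set_point


def prior_weight(t_days, half_life_days=30.0):
    """ASSUMED decay schedule w(t); the abstract does not give its form."""
    return 0.5 ** (t_days / half_life_days)


def blended_set_point(g_genomic, empirical_mean, weight):
    """G-hat_t = w(t) * G-hat_genomic + (1 - w(t)) * P-bar_t."""
    return weight * g_genomic + (1.0 - weight) * empirical_mean


if __name__ == "__main__":
    # The abstract's example: the same HRV of 55 ms under two different priors.
    print(deviation(55.0, 80.0))  # negative: suppression hypothesis
    print(deviation(55.0, 30.0))  # positive: enhancement hypothesis
    # The prior's influence fades as behavioral data accrue.
    for day in (0, 30, 90):
        w = prior_weight(day)
        print(day, blended_set_point(80.0, 55.0, w))
```

With the abstract's numbers, the same 55 ms reading gives a deviation of -25 ms against an 80 ms prior and +25 ms against a 30 ms prior, which is the sign reversal the authors describe.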
q-bio.MN 2026-06-11

Drosophila NMJ release probabilities do not maximize information

by Eitan Goldfein, Sarah Marzen

Predictions for and lack of maximal information transmission in the neuromuscular junction

Theoretical optimum from dose-response curves shows little overlap with measured distribution, indicating other constraints shape the juncti

A key question in theoretical biology is how effectively biological systems preserve information about their inputs while operating under physical and functional constraints. We examine that question at the neuromuscular junction (NMJ) by studying how neurotransmitter concentration is transformed into current at both cholinergic and glutamatergic NMJs. An information maximization analysis was used to derive a theoretical distribution over neurotransmitter concentrations based on biological understandings of dose-response relationships. These theoretical distributions were compared to an experimentally derived distribution obtained from a Drosophila NMJ. The theoretical and experimental distributions showed very little agreement, indicating that the Drosophila NMJ does not shape its distribution of synaptic vesicle release probabilities in order to maximize information transmission from nervous system to muscle. Predictions for cholinergic systems are provided.
q-bio.MN 2026-06-11

Reaction networks implement linear regression at steady state

by Aryan Kumar, Amey Choudhary +3 more

Implementation of Linear Regression and Linear Interpolation using Reaction Networks

Steady-state concentrations of species encode regression and interpolation outputs via a division module that handles negatives.

Performing statistical inference is an essential component of data science. Our focus in this work is on two inference techniques, viz. regression and interpolation. We propose a reaction network based approach that can implement linear regression (both univariate and multivariate) and linear interpolation. We do this by encoding the steady state concentration of species as the output of these inference techniques. Towards this, we use a novel generalized division module that can handle division of negative numbers. We verify our results by comparing them with in-silico implementation on standard synthetic datasets.
q-bio.GN 2026-06-11

m6A-FORM predicts m6A sites at PR-AUC 0.635

by Tinghe Zhang, Sumin Jo +2 more

m6A-FORM: A Foundation Model for Decoding N6-methyladenosine Biology

Pretrained on 22 million peak-derived sequences, the transformer model also supports regulator binding prediction and identifies tissue-cons

N6-methyladenosine (m6A) is the most abundant internal modification in eukaryotic mRNA. However, most existing predictors use adenosine-centered formulations that are computationally inefficient and prone to false positives. Here we present m6A-FORM, a transformer-based foundation model for RNA methylation that uses MeRIP-seq peaks as methylation-enriched priors and is pretrained on approximately 22 million peak-derived sequences from 143 human MeRIP-seq studies. After fine-tuning with high-confidence single-nucleotide m6A annotations from m6A-Atlas v2.0 and GLORI, m6A-FORM-sites achieves state-of-the-art m6A site prediction performance, with a PR-AUC of 0.635 and ROC-AUC of 0.988, improving PR-AUC by at least 0.14 over existing methods while enabling substantially faster inference. Task-specific adaptation further supports prediction of binding sites for 19 m6A-associated regulators and identification of YTHDF2-bound m6A sites associated with mRNA degradation. Applying m6A-FORM across 67 datasets from 24 human tissues identifies 19,631 tissue-conserved sites with distinct localization, clustering, methylation, expression, RBP-interaction, and decay-associated signatures.
physics.chem-ph 2026-06-10

Assembly theory sizes drug-like space at 10^117 molecules

by Juan Carlos Morales Parra, Keith Y Patarroyo +3 more

Elucidating the Size of Chemical Space with Assembly Theory

Minimum recursive bond steps yield super- to double-exponential bounds that reach 10^117 at index 25 below 500 Da.

Chemical space is unimaginably vast, with common heuristic estimates suggesting that there are ca. 10^60 'drug-like' molecules possible below a molecular mass of 500 Da. However, these estimates largely ignore the structural and synthetic complexity of the molecules enumerated. Here we present a first-principles estimate of the size of chemical space using Assembly Theory, which quantifies the amount of causation required to form a molecule, captured in the Assembly Index. This is a measurable molecular complexity measure derived from the minimum number of recursive bond-joining operations required to construct a molecular graph. Assembly Theory partitions chemical space into levels defined by Assembly Index, allowing bounds to be placed on its growth as molecular complexity increases. We show that chemical space (the accumulated Assembly Index level sets) grows at least super-exponentially, and at most double-exponentially, with respect to the Assembly Index. Using the GDB-13 database as a reference for growth-rate estimation, we model how chemical space expands under increasing complexity and contracts under structural constraints, including atom and bond types, number of rings, ring size, and chemical motifs. Under constraints comparable to standard drug-like estimates, including molecular mass below 500 Da, our analysis yields a chemical space of approximately 10^117 molecules at Assembly Index 25. Finally, we constrain chemical space by biologically relevant motifs and identify structurally relevant molecules near the accessible boundaries of these assembly-defined spaces.
physics.chem-ph 2026-06-08

Conformer ensembles cut solvation error 11-13 percent but add nothing else

by Bryan Cheng, Austin Jin +1 more

When Three-Dimensional Conformer Ensembles Improve Molecular Property Prediction Beyond Two-Dimensional Fingerprints: A Systematic Study

Systematic tests on 14 targets show gains only for solvent-dependent tasks, with physical evidence from split type, molecule size, and data

When do three-dimensional conformer ensembles improve molecular property prediction beyond two-dimensional fingerprints? We provide the first systematic, mechanistically grounded answer. Through ~1,000 experiments spanning 13 model configurations, 14 regression targets, and 2 classification targets across MoleculeNet, QM9, and MARCEL benchmarks, we discover selective complementarity: conformer ensemble statistics extracted via Distribution Kernel Operators (DKOs) yield statistically significant RMSE reductions on solvation-dependent properties (ESOL -11.0%, p < 10^{-9}; FreeSolv -13.5%, p < 3x10^{-5}; 10-seed paired validation) while providing no benefit for electronic or steric tasks. Three lines of evidence confirm this selectivity has a physical rather than statistical basis: improvement is larger under scaffold splits than random splits (+11.9% vs. +8.5% on ESOL), concentrates on large, flexible molecules (+18.9% for heaviest quartile), and grows monotonically with training data. We establish a four-tier performance hierarchy: end-to-end 3D GNNs (SchNet, PaiNN; 21-42% over fingerprints) >= engineered physicochemical descriptors (PMI/SASA/USR) > Morgan fingerprints + XGBoost > all neural conformer ensemble methods, confirmed by two architecturally diverse GNNs and revealing that the pre-computed feature bottleneck limits ensemble approaches. Feature attribution and mutual information analysis expose the mechanistic asymmetry: conformer mean features carry 2-8x more information per feature than fingerprint bits, yet covariance features contribute <2% of model signal, explaining why five simple scalar invariants outperform all complex covariance architectures (p < 0.001). These findings yield an empirical property taxonomy and a practical decision framework for when conformer generation is worth the investment.
cs.LG 2026-06-08

Transformer plus reranker reaches 59.4% top-1 retrosynthesis accuracy

by Raja Sekhar Pappala, Shreyas Vinaya Sathyanarayana +3 more

RETROSPECT: RETROsynthesis via Sequential Prediction, and Chemically Transformed-ranking

Generator alone hits 55% exact match on 5,007 reactions; learned ranking adds four points on pools of 111 candidates each.

Single-step retrosynthesis needs both accurate first-ranked suggestions and candidate lists that are rich enough for downstream selection. We study this as a proposal-selection decomposition. Our system, RETROSPECT, combines a single Transformer proposal model, which we call the ChemAlign Transformer, with a LambdaMART reranker over structural, reaction-template, upstream-score, and optional DFT-derived descriptors. The generator is trained with hybrid root-aligned and random SMILES augmentation, Pre-LayerNorm, tied embeddings, exponential moving average weights, and a differentiable atom-balance auxiliary loss. On the full USPTO-50K test set of 5,007 reactions, the generator reaches 55.00% top-1 and 86.18% top-10 exact-match accuracy with 99.86% top-1 validity. On the merged candidate-pool benchmark used for reranking, which contains 5,007 test products and about 111 candidates per product, a LambdaMART model trained on the structural feature set reaches 59.4% top-1 with 0.7171 mean reciprocal rank. Feature ablations show that upstream proposal score and template-frequency statistics provide most of the reranking signal, while DFT and reaction-center DFT features provide smaller and less consistent gains. These results support a modular view of retrosynthesis: stronger single-model proposal and learned candidate selection are complementary, and the proposal model can serve as a drop-in component for ensemble systems such as RetroChimera (Maziarz et al., 2024)
cs.LO 2026-06-01

Modulation-reaction networks link reaction flows to their regulators

by Leo Lobski, Yoàv Montacute

Modulation-Reaction Networks

A logic with structural modalities and temporal fixed points expresses reachability, sustained production, and attractors under synchronous

Biochemical systems involve both the flow of matter, in which entities transform into one another via reactions, and the flow of information, in which entities regulate which reactions may occur. Boolean networks capture the latter; reaction networks capture the former. Yet no unified qualitative formalism treats regulated reactions as its principal objects of study, despite their prominence in standards such as the Systems Biology Graphical Notation Process Description (SBGN-PD) language. We introduce modulation-reaction networks (MR-networks), a mathematical framework in which entities modulate reactions through activations and inhibitions, and study their synchronous Boolean semantics. To reason about MR-networks we develop Modulation-Reaction Logic (MRL), a hybrid modal $\mu$-calculus whose modalities reason about the structure of the network and whose fixed-point operators capture temporal evolution of the computation. We establish a collection of validities, including a complete characterisation of the one-step update rule, and demonstrate the expressive power of MRL by formalising properties of biological interest such as reachability, sustained production, and presence of attractors. We show that MRL admits model-checking via an evaluation game, and introduce a bisimulation relation for MR-networks, which is proved to be invariant for all MRL-formulas. As a step towards a biologically more realistic computational model, we sketch the asynchronous semantics of MR-networks, and outline how the developments for the synchronous case transfer to the study of the asynchronous one.
cs.LG 2026-05-29

ML matches DL accuracy on protein graphs but runs 10x faster

by Aydin Wells, Francis A. Gatsi +2 more

Traditional machine learning vs. deep learning from dynamic graph representations of proteins' 3D folds in the task of protein structure classification

Head-to-head test on 72 datasets with 44,000 proteins finds no accuracy gain from deep learning, only higher cost.

Protein structure classification (PSC) uses supervised learning to predict a protein's CATH/SCOP(e) class from the protein's sequence or 3D structural feature(s). We already modeled 3D structures as (static) protein structure networks (PSNs), demonstrating the competitiveness of PSN-based features to sequence or direct (i.e. non-network) 3D structural features in the PSC task. More recently, we demonstrated the power of features extracted from dynamic PSNs over features extracted from static PSNs (and thus by transitivity over sequence and direct 3D structural features) in the same task. That dynamic PSN approach used traditional machine learning (ML), combining manual (pre-engineered) features with an off-the-shelf classifier. Here, we evaluate whether automatic deep learning (DL) from the dynamic PSNs yields improvements. Our evaluation on 72 datasets spanning ~44,000 CATH- or SCOPe-labeled dynamic PSNs reveals that in terms of PSC accuracy, traditional ML and DL are (close to) tied for a large majority of the datasets, while DL is on average 10+ times slower. We are the first to evaluate traditional ML vs. DL in the dynamic PSN-based PSC task.
cond-mat.soft 2026-05-28

Biomolecular phase separation thresholds track excess chemical potential

by Huan-Xiang Zhou

Determinants of Phase-Separation Propensities, Material States, and Material Properties of Biomolecular Condensates

Interaction strength, valency, and bond lifetimes set when condensates stay liquid or arrest as gels and aggregates and how viscous they bec

Phase separation of various materials has been studied for one and a half centuries. In the last two decades, phase separation of proteins and nucleic acids has received enormous attention, due to its relevance to cellular functions. However, many of the observations on the resulting biomolecular condensates lack a theoretical underpinning. The first goal of this Account is to put forward theoretical frameworks for the phase-separation propensities, material states, and material properties of biomolecular condensates. Using these frameworks, I rationalize mechanistic interpretations from our recent experimental and computational studies, and synthesize these studies with prior literature to draw new conclusions. For phase-separation propensities, I relate the threshold (or saturation) concentration to the excess chemical potential in the dense phase, which in turn depends on intermolecular interaction strength and valency. For material states, I posit that liquid droplets form via complete phase separation, whereas amorphous dense liquids, reversible aggregates, and gels arise from premature termination of spinodal decomposition, due to overly weak or overly strong interactions or directional interactions. In particular, gels and aggregates are different forms of dynamically arrested states, with gels driven by tip growth via directional interactions and aggregates driven by monomer addition at interior sites to maximize valency. For material properties, I highlight the crucial role of the stress relaxation time, which is determined by the mean lifetime of intermolecular bonds in a condensate. This relaxation time dictates how the condensate manifests viscoelasticity, including shear thickening and shear thinning, and accounts for the wide variation in zero-shear viscosity among different condensates.
q-bio.MN 2026-05-26

Any RAF set is stoichiometrically autocatalytic

by Richard Golnik, Thomas Gatter +3 more

Bridging two theoretical frameworks of autocatalysis: RAF sets and stoichiometric autocatalysis

Proof unifies two definitions by showing RAF sets meet net-production criteria under general conditions.

Autocatalysis lies at the heart of many (bio)chemical processes and is key to processes leading up to the origin of life. Two seemingly very different formalisms have emerged that define autocatalysis. Kauffman introduced collective autocatalysis to describe systems of molecules that mutually catalyze each other's formation, emphasizing the self-sustaining character of autocatalytic systems. This view is mathematically formalized in the theory of Reflexively Autocatalytic and Food-generated sets (RAF). In parallel, stoichiometric autocatalysis emerged from the theory of Chemical Reaction Networks (CRN), focusing on the net-productive, self-amplifying character of autocatalytic subnetworks. These two frameworks have coexisted independently in the literature, since RAF theory considers each reaction as explicitly catalyzed, while the CRN approach often excludes explicitly catalyzed reactions altogether. Nevertheless, both frameworks describe reaction networks and thus admit a common mathematical representation in terms of stoichiometric matrices. We highlight this connection and show that the two formalisms are less disparate than they might appear. To illustrate this point we prove that, under mild and general conditions, any RAF is stoichiometrically autocatalytic.
cs.DB 2026-05-25

Knowledge graph preserves metabolomics provenance links

by Matthieu Féraud, Dina Boukhajou +2 more

MetaboKG: An Analysis-centric Knowledge Graph Framework for Untargeted Metabolomics

Workflow and extended identifiers turn scattered spectra and annotations into traceable, queryable data.

Untargeted metabolomics generates large volumes of tandem mass spectrometry (MS/MS) data and computational annotations that can reveal molecular mechanisms across organisms and environments. Public reuse has improved through harmonized repository metadata and access infrastructures such as Pan-ReDU, and through metabolomics knowledge graphs such as ENPKG and METRIN-KG. Yet the analytical layer remains fragmented: spectra, features, workflow outputs, annotations, confidence evidence, and contextual metadata are still scattered across repositories and tabular artifacts. We present MetaboKG, an analysis-centric knowledge graph framework for engineering reusable metabolomics knowledge from public repositories, metadata, and GNPS molecular network results. MetaboKG contributes a transformation workflow that preserves links between repository exports, analytical files, spectra, features, and annotation results; a semantic model grounded in PROV-O and SIO and aligned with the Mass Spectrometry ontology (MS), ChEBI, NCBITaxon, ENVO, and NCIT to represent provenance, analytical evidence, metadata attributes, and controlled vocabulary terms; and a Universal Annotation Identifier strategy extending the Universal Spectrum Identifier (USI) with workflow-specific components for late binding, incremental ingestion, and post hoc linkage across analyses. We demonstrate MetaboKG at the public-repository scale on 680 GNPS molecular networking results and evaluate it through competency questions covering biochemical enrichment, environmental specificity, and cross instrument analytical variation. Results show that graph-based integration supports traceable annotation reuse and reproducible SPARQL exploration of biochemical relationships that remain fragmented across repository-native resources.
q-bio.MN 2026-05-25

Uniform sampling lifts canalizing sensitivity toward 1.183

by Ahana Ghosh, Claus Kadelka

Uniform sampling of canalizing Boolean functions reveals hidden biases in Boolean network analysis

Parameter sampling had locked it at exactly 1 by suppressing high-sensitivity functions, leading to understated enrichment of low-sensitivit

Boolean networks are widely used to model gene regulatory systems, where ensembles of Boolean functions serve as null models for assessing structural and dynamical properties. A common approach generates canalizing and nested canalizing functions by sampling their defining parameters uniformly at random. However, because multiple parameterizations can represent the same Boolean function, this induces a non-uniform distribution over distinct functions and systematically biases random ensembles. Here, we develop efficient algorithms for uniform sampling of Boolean functions with prescribed exact or minimal canalizing depth that correct this bias. Our approach combines dynamic programming for sampling canalizing layer structures with rejection-based methods and is implemented in BoolForge. We show that the sampling scheme substantially affects commonly studied function-level metrics. Under traditional parameter-uniform sampling, the expected average sensitivity of nested canalizing functions equals one independent of the number of variables. In contrast, under function-uniform sampling, the expected sensitivity increases with system size and numerically approaches approximately 1.183. This discrepancy arises from an exponential suppression of high-sensitivity functions under parameter-based sampling. These differences propagate to Boolean network models, affecting conclusions about robustness, stability, attractor structure, and baseline dynamical expectations. Revisiting 122 published Boolean gene regulatory network models, we show that function-uniform null models reveal a substantially stronger enrichment of low-sensitivity canalizing architectures than previously inferred. Our results demonstrate that widely used null models systematically underestimate baseline sensitivity and can therefore distort assessments of the stabilizing role of canalization in biological networks.
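The sensitivity metric discussed in this abstract can be illustrated with a brute-force sketch. It assumes the common unnormalized definition of average sensitivity (the expected number of single-input flips that change the output, over uniformly random inputs); the paper may use a different normalization, and the function names below are illustrative, not taken from BoolForge.

```python
from itertools import product


def average_sensitivity(f, n):
    """Average number of single-bit flips that change f, over all 2**n inputs.

    Assumes the unnormalized definition; dividing by n gives the normalized one.
    """
    changes = 0
    for x in product((0, 1), repeat=n):
        for i in range(n):
            y = list(x)
            y[i] ^= 1  # flip input i
            if f(*x) != f(*y):
                changes += 1
    return changes / 2 ** n


def and3(a, b, c):
    # Nested canalizing: any input equal to 0 forces the output to 0.
    return a & b & c


def nested(a, b, c):
    # Another nested canalizing function: a AND (b OR c).
    return a & (b | c)


def xor3(a, b, c):
    # Parity is not canalizing; every flip changes the output.
    return a ^ b ^ c


if __name__ == "__main__":
    for name, f in [("and3", and3), ("nested", nested), ("xor3", xor3)]:
        print(name, average_sensitivity(f, 3))
```

The enumeration costs O(n 2^n) evaluations, so it only suits small n; it is meant to make the metric concrete, not to reproduce the paper's sampling algorithms.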
q-bio.MN 2026-05-22 2 theorems

Clustering systems realize as level-k networks iff block overlap minima ≤ k

by Shilong Dai, Yangjing Long

A Characterization of Level-k Realizability for Clustering Systems

The theorem ties the required reticulation level to the smallest families that generate all overlap intersections inside each Hasse-diagram block

We give a Hasse-diagram characterization of when a clustering system $\mathcal C$ on a finite taxa set $X$ is the hardwired clustering system $C_N$ of a rooted level-$k$ network. For each non-trivial block $B$ of $H=\mathcal H[\mathcal C]$, we define a parameter $\mu(B)$ using minimum families of clusters that generate all overlap-intersections inside $B$. The main theorem proves that there exists a rooted level-$k$ network $N$ with $C_N=\mathcal C$ if and only if $\mu(B)\le k$ for every non-trivial block $B$ of $H$. The necessity proof shows that overlap-intersection pieces must be represented by non-root hybrid vertices in any realizing block. The sufficiency proof is constructive: starting from the Hasse diagram, it iteratively splits selected hybrid vertices, preserves the hardwired clustering system, and terminates with a realization whose level is bounded by the block-wise values of $\mu$.
cs.LG 2026-05-21 2 theorems

One embedding predicts conditions and retrieves precedents

by Shreyas Vinaya Sathyanarayana, Raja Sekhar Pappala +1 more

HiRes: Inspectable Precedent Memory for Reaction Condition Recommendation

HiRes reaches top accuracies on catalyst, solvent and reagent tasks while letting users inspect similar past reactions for justification.

Reaction condition recommendation sits immediately after retrosynthetic disconnection selection, and in practice, chemists require both accurate predictions and the precedents that justify them. We present HiRes (Hierarchical Reaction Representations), a retrieval-augmented condition recommendation system whose learned reaction space serves as both a classifier feature and an inspectable precedent memory. The model combines a graph encoder, transformation-aware cross-attention, multi-stream reaction fusion, and a k-NN retrieval layer. HiRes achieves state-of-the-art performance among primary-slot USPTO-Condition models, reaching Catalyst, Solvent, and Reagent top-1 accuracies (Acc@1) of 0.929, 0.534, and 0.530 respectively. It ties the best reported baseline on Catalyst while outperforming models such as REACON on Solvent and Reagent. Furthermore, paired bootstrap analysis demonstrates that integrating retrieval with learned condition heads provides statistically significant gains for solvent and reagent selection over purely parametric approaches. Ultimately, HiRes bridges the gap between predictive accuracy and chemical interpretability, offering a single representation that supplies both competitive recommendations and the concrete chemical precedents necessary for practical synthesis planning.
q-bio.BM 2026-05-20 2 theorems

Microbial chemistry clusters in distinct elemental space

by Pilar C. Vergeli, Cole Mathis +6 more

Elemental Stoichiometry as an Ecological Biosignature with Applications to Life Detection

Enrichment in heteroatoms and shifted ratios set biological samples apart from synthetic and planetary data, offering a statistical test for

The vast chemical space of possible small molecules, estimated at 10^60 compounds for molecules composed of just C, N, O, and S, is only sparsely occupied by biology. We propose that where life selects molecules within this space constitutes a detectable ecological signature: a fingerprint not of specific compounds, but of the statistical structure of elemental composition across molecules sampled from ecological systems. Here we introduce a framework combining Van Krevelen diagrams and element scaling laws to characterize the elemental composition of regions of chemical space occupied by biological systems and contrast them with other chemical systems. Applying this framework to 11,834 microbial metagenomic samples, we show that microbial metabolisms occupy a region of chemical space that is enriched in heteroatoms such as P, S, N, and O relative to C and shifted toward higher O:C and H:C ratios. We observe sublinear element scaling with system size, yielding insights into how elemental constraints dictate how biological systems occupy chemical space. These patterns are distinct from a sample of 18,000 compounds from the comprehensive Reaxys synthetic chemical database. Critically, datasets from molecules detected in planetary science mission data occupy statistically distinct regions from both terrestrial biological and Reaxys distributions, demonstrating that with standardized methods for data collection, the approach could be developed to discriminate biotic from abiotic chemical signatures in small molecule data from planetary science missions. Our work shows how a combination of Van Krevelen fingerprinting and elemental scaling laws can provide a new class of ecological biosignatures for life detection leveraging mass spectrometric data from planetary missions, which could generalize beyond Earth's specific biochemistry.
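A Van Krevelen diagram places each molecule at its (O:C, H:C) atomic ratios. The sketch below shows how such a point could be computed from a molecular formula. It assumes flat formulas without parentheses, charges, or isotopes, and the helper names are hypothetical rather than drawn from the paper's pipeline.

```python
import re


def element_counts(formula):
    """Count atoms in a flat formula such as 'C6H12O6' (no parentheses)."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def van_krevelen_point(formula):
    """Return (O:C, H:C) atomic ratios, the two Van Krevelen axes."""
    counts = element_counts(formula)
    carbon = counts.get("C", 0)
    if carbon == 0:
        raise ValueError("formula contains no carbon; ratios are undefined")
    return counts.get("O", 0) / carbon, counts.get("H", 0) / carbon


if __name__ == "__main__":
    for formula in ["C6H12O6", "C2H6O", "C10H16N5O13P3"]:
        print(formula, van_krevelen_point(formula))
```

Plotting these points for a whole sample gives the kind of elemental fingerprint the abstract describes; the statistical comparison between datasets is a separate step not sketched here.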
q-bio.GN 2026-05-19 Recognition

Algorithm extracts active TF sites from mutation groups

by Doruk Efe Gökmen, Rosalind Wenshan Pan +5 more

Informational blueprints reveal condition-dependent gene regulatory architectures

Optimised filters across full promoters identify correlated mutations with biggest collective impact on expression under given conditions.

While coding regions in the genome have a direct interpretation in terms of protein products, significant fractions are non-coding and yet control essential biological functions. Unlike the genetic code, there is no "lookup table" that identifies where regulatory proteins, known as transcription factors (TFs), bind. Here, we extract these binding sites by distilling sequences of nucleotide letters into collective coordinates (hyperletters) representing the binding sites that are active under specific environmental conditions. Going beyond local information footprints between individual bases and expression levels, our $\textit{information blueprint}$ algorithm compresses the global information by optimising filters that simultaneously scan an entire promoter sequence. Inspired by renormalisation-group techniques, we identify TF binding sites as coarse-grained variables combining groups of correlated mutations with the highest collective impact on gene expression. We validate our approach on experimental data for $\textit{E. coli}$ and discover novel regulatory elements illustrating its deployment at scale across growth conditions.
q-bio.MN 2026-05-18

Control theory ranks aging interventions by restoration cost

by Alex Zhavoronkov

Control Laws in Aging and Longevity

By modeling drugs as non-commuting vector fields on state space, the framework outputs sequences that reduce the minimum safe control cost f

Existing aging theories describe what changes with age but do not prescribe how to intervene. We propose a control-theoretic framework that is not merely descriptive but prescriptive: it specifies which intervention, at which dose and sequence, under which safety constraints, will restore a measured biological state to a functional region. Aging is defined as the progressive loss of safe controllability; biological age is the minimum safe control cost of functional restoration. Drugs are modeled as vector fields on biological state space whose non-commutativity, quantified by Lie brackets, predicts that intervention order determines outcome. The core differentiation from prior theories is operational: the framework outputs ranked targets, optimal sequences, safety-constrained protocols, and falsifiable predictions directly usable in drug discovery, rather than mechanistic ontologies or correlative biomarkers. We present a five-dimensional ODE model with analytic Lie-bracket derivation, a modality-aware control layer, three translational case studies, an implementation architecture with power analysis, and empirical scoring of aging interventions across five biological epochs. Twenty falsifiable predictions are enumerated. The central claim is that control-value reduction predicts translational success better than Hallmark annotation or biomarker reversal alone. If validated, this provides the missing interventional layer connecting aging biology to rational gerotherapeutic discovery.
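The claim that intervention order matters when vector fields fail to commute can be seen in a toy case. For linear vector fields f(x) = Ax and g(x) = Bx, the Lie bracket is again linear, with matrix equal to the commutator AB - BA up to a sign convention. The sketch below uses two hypothetical 2x2 matrices chosen only for illustration; it is not the paper's five-dimensional ODE model.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def commutator(a, b):
    """Return AB - BA; a zero matrix means the two linear fields commute."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(ab[0]))] for i in range(len(ab))]


# Hypothetical "interventions" acting on a two-dimensional state.
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]

# Diagonal matrices always commute with each other.
D1 = [[1, 0], [0, 2]]
D2 = [[3, 0], [0, 4]]

if __name__ == "__main__":
    print(commutator(A, B))    # nonzero: applying A then B differs from B then A
    print(commutator(D1, D2))  # zero: order does not matter
```

A nonzero commutator means the flows of the two fields cannot be swapped without changing the end state, which is the sense in which sequence affects outcome.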
math.DS 2026-05-18 1 theorem

Sign conditions on polynomials guarantee spatial instabilities

by Carsten Conradi, Maya Mincheva +1 more

Conditions for spatial instabilities and pattern formation from monomial steady state parameterizations

Monomial steady-state parameterization turns Turing-like pattern criteria into algebraic inequalities on rate constants and diffusion rates.

We study the onset of spatial instabilities in reaction networks where the spatially homogeneous system admits a steady state parameterization. We formulate a sufficient condition -- based on the signs of the constant and leading coefficients of the characteristic polynomial of the linearized Jacobian scaled by the diffusion coefficients -- that guarantees a Turing-like instability to spatially inhomogeneous solutions on appropriately chosen domains $\Omega$. We also present a specific condition on the domain size $|\Omega|$ required to trigger this instability. As a consequence of employing a monomial parameterization, these conditions take the form of algebraic polynomial inequalities involving only rate constants and diffusion coefficients. We apply these ideas to a network describing the sequential and distributive (de-)phosphorylation of a protein at two binding sites, ultimately deriving a condition involving only the four catalytic constants of the enzymes and the diffusion coefficients of the four enzyme-substrate complexes that guarantees a Turing-like instability.
math.PR 2026-05-15 Recognition

Biochemical networks cannot suppress noise simultaneously across components

by David F. Anderson

Noise Tradeoffs, Stationary Information Flow, and Structural Balance in Unit-Birth Networks

Information-flow identities prove that sub-Poissonian levels in multiple parts require frustrated topologies.

In 2019, Paulsson and collaborators conjectured that stochastic biochemical control networks have fundamental limits on how much intrinsic noise can be simultaneously suppressed across multiple components. Ripsman, Kell, and Hilfinger recently proposed a formal proof strategy for unit-birth models based on a stationary information-theoretic decomposition. Here, we provide a rigorous mathematical justification for this argument. We consider continuous-time Markov chains on $\mathbb{Z}^N_{\ge 0}$ in which each component is degraded linearly and produced in unit births at a state-dependent rate depending on the other components but not on itself. Noise in component $i$ is measured by the Fano factor $F_{X_i}$, the ratio of stationary variance to mean, with Poisson value $1$ as baseline. Our first contribution is to isolate explicit hypotheses on moments, mean birth rates, and total-rate growth under which the formal information-flow identities can be rigorously justified. Following the proof outline of Ripsman, Kell, and Hilfinger, we then prove the conjecture. Our second contribution is to make these hypotheses checkable: a uniform positive lower bound on the birth rates and at-most-linear total growth dominated by the weakest degradation rate suffices, via Foster--Lyapunov methods. Our third contribution is a structural strengthening. Under a signed monotonicity condition on the rate functions, satisfied by structurally balanced signed interaction networks, we prove that the stationary distribution is associated with respect to the corresponding signed partial order. This upgrades the global tradeoff to the termwise bound $F_{X_i}\ge 1$ for every $i$. Hence, within the signed-monotone subclass, sub-Poissonian noise requires a frustrated interaction topology.
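The Fano factor defined in this abstract (variance divided by mean, with 1 as the Poisson baseline) is simple to estimate from copy-number samples. The sketch below is a minimal empirical estimator using the population variance; the sample values are made up for illustration and do not come from the paper.

```python
from statistics import fmean, pvariance


def fano_factor(counts):
    """Empirical Fano factor: population variance divided by mean."""
    mean = fmean(counts)
    if mean == 0:
        raise ValueError("mean is zero; the Fano factor is undefined")
    return pvariance(counts, mu=mean) / mean


def noise_regime(counts, tol=1e-9):
    """Classify samples relative to the Poisson baseline F = 1."""
    f = fano_factor(counts)
    if f < 1 - tol:
        return "sub-Poissonian"
    if f > 1 + tol:
        return "super-Poissonian"
    return "Poissonian"


if __name__ == "__main__":
    samples = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical copy numbers
    print(fano_factor(samples), noise_regime(samples))
```

With finite samples the estimate fluctuates around the stationary value, so a classification near 1 should be read with that uncertainty in mind.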
cs.LG 2026-05-15 2 theorems

PACER optimizes causal graphs directly over valid DAGs

by Ramon Viñas Torné, Sílvia Fàbregas Salazar +5 more

PACER: Acyclic Causal Discovery from Large-Scale Interventional Data

A joint model of permutations and edge probabilities removes penalties and yields up to 100x speedups on large interventional datasets.

Inferring the structure of directed acyclic graphs (DAGs) from data is a central challenge in causal discovery, particularly in modern high-dimensional settings where large-scale interventional data are increasingly available. While interventional data can improve identifiability, existing methods remain limited by soft acyclicity constraints, leading to optimization over invalid cyclic graphs, numerical instability, and reduced scalability. We introduce PACER (Perturbation-driven Acyclic Causal Edge Recovery), a scalable framework for causal discovery that guarantees acyclicity by construction. PACER parameterizes a distribution over DAGs through a joint model of variable permutations and edge probabilities, enabling direct optimization over valid causal structures without surrogate penalties. The framework supports a unified likelihood-based treatment of observational and interventional data, flexible conditional density models, and the incorporation of structural prior knowledge. For linear-Gaussian mechanisms, we derive closed-form expressions for the expected interventional log-likelihood and its gradients, yielding substantial computational gains. Empirically, PACER matches or exceeds state-of-the-art methods on protein signaling and large-scale genetic perturbation benchmarks, while scaling efficiently to networks with thousands of variables and achieving up to two orders of magnitude speedups over penalty-based differentiable approaches. These results demonstrate that exact and scalable causal discovery from high-dimensional perturbation data is achievable through principled search space design.
q-bio.MN 2026-05-15 2 theorems

Methylation feedback dynamically reshapes gene expression

by Kaifeng Wang, Ming Han

Autonomous Reshaping of Expression Landscapes by DNA Methylation

Coupled models show preferred states drifting continuously, committing without multiple stable states.

DNA methylation is usually treated as an epigenetic memory mark: transcriptional history is written into regulatory DNA and later stabilizes a chosen cell identity. This picture explains persistence, but it makes memory passive. Here we show that the same promoter-level coupling required for methylation memory can instead turn methylation into an internal control variable for regulatory dynamics. Transcription-factor occupancy protects regulatory DNA from methylation, while methylation shifts later transcription-factor binding thresholds. Under time-scale separation, this reciprocal loop separates into fast expression dynamics conditioned on methylation and a slow methylation flow written by expression. Minimal promoter, self-activation, and fate-toggle models show that this feedback does more than preserve a past state: it autonomously reshapes the expression landscape. In a methylation-coupled toggle, the preferred expression state can move continuously through single-well drift, allowing commitment without first entering a multiwell regime. Stochastic simulations further show that evolving methylation reduces fate reversals relative to a frozen landscape, making weak early expression bias more predictive of later fate. These results recast DNA methylation from a downstream stabilizer of cell identity into a slow dynamical coordinate that can help determine how regulatory states are chosen.
q-bio.MN 2026-05-13 2 theorems

Vertex measures match VR in spotting cancer genes

by Edmara Viana, Rodrigo Henrique Ramos +2 more

Scalable vertex guided filtrations identify structurally relevant genes in cancer networks

They recover known essentials, nominate new drivers, and compute cavities where full VR fails.

Topological data analysis (TDA) has established itself as a useful tool for capturing multiscale structures in complex networks, such as connected components, cycles, and cavities. Although Vietoris-Rips (VR) filtering is widely used in network analysis, it tends to be computationally expensive, especially for large networks. This work explores vertex function-based (VFB) filtering based on network measures, applying persistent homology to identify relevant topological structures in cancer-associated protein networks, and compares its effectiveness with the VR approach. The results show that VFB reproduces the second-order structures (Betti-2) identified by VR, recovering previously reported essential genes. In addition, VFB detected new driver genes, confirmed in databases such as IntOGen and NCG, and allowed analysis of third-order structures (Betti-3) that was not feasible with VR. Thus, VFB represents a scalable alternative to VR, preserving biological interpretability and complementing classical network metrics.
math.DS 2026-05-13 2 theorems

One futile cycle network combines bistability with fixed final product

by Badal Joshi, Tung D. Nguyen +1 more

Bistability, Absolute Concentration Robustness, and Hysteresis in Dual-Site Futile Cycles with Bifunctional Enzymes

Two stable states share identical final modification levels despite different intermediates, a feature absent from the other three networks.

Bifunctional enzymes, which catalyze both the forward and reverse steps of a substrate modification reaction, arise naturally in bacterial two-component signaling systems and metabolic regulation. Beyond their well-known role in conferring absolute concentration robustness (ACR) on substrate species, bifunctional enzymes profoundly shape the dynamical landscape of the networks in which they appear. We study a class of dual-site futile cycles in which the reverse modification steps are carried out by bifunctional enzyme-substrate compounds, and provide a complete mathematical analysis of all four such networks, characterizing the existence, number, and stability of steady states, as well as the bifurcation structure as total substrate is varied. All four networks admit boundary steady states, in contrast to the non-bifunctional case. The networks differ in the number and stability of boundary steady states, in the maximum number of positive steady states (ranging from two to four), and in whether bistability is present. In two networks, a transcritical bifurcation connects the boundary and positive steady state branches; in one case this is a backward bifurcation, producing hysteresis. Perhaps the most striking phenomenon occurs in one of the four networks, which simultaneously exhibits bistability and ACR in the final modification state, where the system can settle into either of two stable steady states with different intermediate concentrations yet identical final product concentration.
math.PR 2026-05-11 2 theorems

Bursting gene networks reach unique equilibria with explicit rates

by Mathilde Gaillard, Ulysse Herbach

Quantitative ergodicity for gene regulatory networks with transcriptional bursting

Coupling arguments prove existence, uniqueness, and Wasserstein convergence bounds for any number of genes under regular jump rates.

We study the long-term behavior of two piecewise-deterministic Markov processes used to model stochastic gene regulatory networks with bursting dynamics. Under regularity assumptions on the jump rate, we prove the existence and uniqueness of the stationary distribution for an arbitrary number of interacting genes and an arbitrary strength of interaction. Using coupling methods, we also provide explicit upper bounds for the convergence to equilibrium in terms of Wasserstein distances.
q-bio.MN 2026-05-11 2 theorems

GNN explanations spot disease hubs via decaying attribution

by Kyle Higgins, Ivan Laponogov +2 more

Graph neural network explanations reveal a topological signature of disease-associated hubs in biological networks

Attribution peaks in the immediate neighborhood of cancer hubs and fades outward, improving prioritization of key genes and pathways in TCGA

Graph neural networks (GNNs) are increasingly used to model biological systems, yet the reliability of post-hoc explanation methods for recovering meaningful molecular mechanisms remains unclear. Here, we systematically evaluate four widely used approaches: Saliency Attribution (SA), Integrated Gradients (IG), GNNExplainer, and Layer-wise Relevance Propagation (LRP) for identifying disease-relevant structure in breast cancer RNA-seq data projected onto a protein-protein interaction network. Using synthetic benchmarks with known ground-truth motifs, we show that explanation methods recover distinct signal organizations: SA performs best for sparse single-node drivers, whereas IG and LRP preferentially recover distributed pathway-like and cascade-like signals. In TCGA BRCA data, we identify a consistent topological signature of disease-associated hubs in which attribution peaks in the immediate 1-hop neighborhood and decays across successive network shells, a pattern most pronounced for IG and LRP and associated with strong enrichment of known cancer hubs. We further observe a trade-off between local hub enrichment and global gene ranking performance, with IG optimizing local enrichment and SA achieving superior global discrimination. Motivated by these complementary behaviors, we introduce a framework combining a shell-based hub score with consensus ranking across explainers. Consensus scores improve prioritization of canonical cancer genes (TP53, BRCA1, ESR1, MYC), reduce dependence on node degree, and, especially when tuned, outperform individual methods. Pathway enrichment further reveals improved recovery of biologically coherent cancer programs, including ERBB2, RTK, MAPK, immune, and cytokine signaling. Together, these results demonstrate that topology-aware integration of graph explanations can improve biological interpretability and biologically relevant molecular recovery.
math.DS 2026-05-11 2 theorems

Switching control makes gene densities forget their initial state

by Christian Fernández, Manuel Pájaro +2 more

Predictive-Switching Control of Stochastic Gene Regulatory Networks: A Contractive PIDE Framework

L1-contractivity of the closed-loop PIDE ensures the probability distribution evolves independently of starting conditions.

This paper develops a predictive switching control algorithm for stochastic gene regulatory networks described by a Partial Integro-Differential Equation (PIDE) model, which enables direct shape control of the probability density function. Control inputs are selected from a finite candidate set to minimize a prescribed cost functional. A hybrid framework is proposed for scalability in higher-dimensional systems, using neural networks to approximate the control policy. A central theoretical contribution is a contraction-based analysis of the closed-loop PIDE dynamics. The paper establishes $L^1$-contractivity under the proposed control scheme, yielding formal stability guarantees and showing that the evolution of the probability density becomes progressively independent of the initial condition. Moreover, under strictly positive leakage terms, exponential convergence is obtained. The effectiveness and flexibility of the approach, together with the theoretical contractivity results, are illustrated through numerical simulations on three representative examples of increasing dimensionality.
q-bio.MN 2026-05-11 2 theorems

MaxSMT infers qualitative models up to 1300 genes from noisy data

by Ondřej Huvar, Nikola Beneš +3 more

Inference of Qualitative Models from Steady-State Data via Weighted MaxSMT

Weighted soft constraints let the solver keep the best-fitting model even when some biological observations conflict.

Qualitative models provide crucial instruments for modelling complex biological systems. While advances in automated reasoning and symbolic encodings have enabled rigorous inference of these models from data, the process remains highly fragile. First, biological measurement errors inevitably propagate into formal model specifications. Second, when a specification becomes unsatisfiable, distinguishing between fundamental design flaws and minor technical errors is notoriously difficult. This uncertainty often leads to under-specification, as it is unclear which observations are still ``safe'' to incorporate. To overcome these challenges, we introduce a robust inference method based on weighted MaxSMT. By encoding uncertain biological observations as weighted soft constraints, our approach enables the solver to identify a model best reflecting the observations, even with some conflicting constraints. Our method allows for Boolean and multi-valued variable domains, alongside observations derived from discretisation (level constraints) and differential expression (ordering constraints). We show our approach can be used to successfully infer neural cell differentiation models from prior-knowledge networks with 200--1300 genes using ordering constraints on all included genes.
cond-mat.stat-mech 2026-05-08

Burst timing shapes vesicle signaling activation

by Jan Hauke, Julian B. Voits +1 more

Activation in Vesicle-Mediated Signaling Shaped by Batch Arrival Statistics

Different release patterns with identical average rates produce distinct times to reach activation thresholds through fluctuation effects.

Vesicle-mediated secretion of ions or molecules is a central mechanism of cellular communication, for example in processes such as neurotransmission or hormone release. These events are inherently stochastic: vesicle fusions lead to bursts of variable sizes, releasing discrete packets of transmitters that are subsequently cleared or degraded. The dynamics break time-reversal symmetry due to the interplay of spontaneous bursts and continuous degradation. Using generating functions and a recursion relation, we derive an exact solution for the full time-dependent probability distribution of a general batch arrival-degradation model. This framework also enables a full analysis of first-passage times to a concentration threshold representing downstream activation. We show that activation kinetics are not determined by mean dynamics alone, but depend sensitively on the temporal statistics of arrival events, batch-size variability, and degradation. In particular, different arrival processes with identical mean rates can lead to qualitatively distinct first-passage behavior, reflecting the role of time-asymmetric fluctuations. We also discuss extensions incorporating vesicle depletion. Our results provide a transparent link between stochastic release dynamics and activation timing in vesicle-mediated signaling.
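As a rough illustration of the claim that arrival statistics matter beyond the mean rate, the sketch below simulates a batch arrival-degradation process with a Gillespie-style loop. It is not the paper's exact solution: the fixed batch size, the parameter values, and the function names `first_passage_time` and `mean_fpt` are assumptions chosen for illustration only.

```python
import random


def first_passage_time(rate, batch, degradation, threshold, rng, t_max=1e4):
    """Simulate batch arrivals (fixed size `batch`, Poisson rate `rate`) with
    per-molecule degradation; return the first time the copy number reaches
    `threshold`, or None if `t_max` is exceeded."""
    t, n = 0.0, 0
    while t < t_max:
        total = rate + degradation * n      # total event rate
        t += rng.expovariate(total)         # waiting time to the next event
        if rng.random() * total < rate:     # one batch arrives
            n += batch
            if n >= threshold:
                return t
        else:                               # one molecule degrades
            n -= 1
    return None


def mean_fpt(rate, batch, runs=100, seed=0):
    """Average first-passage time to 20 copies (hypothetical threshold)."""
    rng = random.Random(seed)
    times = [first_passage_time(rate, batch, 1.0, 20, rng) for _ in range(runs)]
    return sum(times) / len(times)


# Both settings deliver 10 molecules per unit time on average.
print("single release:", mean_fpt(rate=10.0, batch=1))
print("bursts of five:", mean_fpt(rate=2.0, batch=5))
```

In this toy setting the two processes share the same mean influx and the same mean steady-state level, yet the bursty one typically crosses the threshold much sooner, because larger fluctuations make excursions above the mean more likely.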
q-bio.MN 2026-05-06 3 theorems

Ocean microbe networks exceed null modularity by 0.15-0.40

by Martin G. Frasch

Modularity Emerges from Action-Functional Constraints in Marine Metabolic Networks: A Biology-Scale Validation of the Network-Weighted Action Principle

The excess aligns with recurring functional modules, showing cost-minimization at work in real metabolic networks.

Biological systems operate under simultaneous energetic and informational constraints, yet direct evidence that such constraints shape real metabolic networks is limited. The Network-Weighted Action Principle predicts that networks under these constraints should organize toward high modularity. We tested this prediction in marine microbiome metabolic networks reconstructed from Tara Oceans metagenomes using two complementary approaches. Composite metrics of protein-deployment efficiency and functional-repertoire complexity (n=10) failed under causal-inference diagnostics, with apparent structure dominated by shared-component bias. In contrast, network modularity (n=7) was high (Q ~ 0.987), but this value was shown to arise from sparsity alone. The biologically meaningful signal is the excess over null models: modularity exceeded configuration-model, label-permutation, and bipartite-incidence nulls by Delta Q ~ 0.15-0.40 (p < 0.001), with the largest effect under the bipartite-incidence control. Fine-grained communities recovered by the network partition are not arbitrary: 25% recur across samples, and the most consistent modules map to known functional units, including enzyme subunits, biosynthetic sequences, and transporter complexes. Together, these results show that modularity excess - rather than absolute modularity - is the appropriate signature of biological organization, and that such excess is consistent with cost-minimization principles operating at the scale of natural metabolic networks.
cond-mat.stat-mech 2026-05-04

Log centroid method recovers Kramers scaling in noisy quiescent systems

by Yefan Wu

Breakdown of Adiabatic Scaling and Noise-Induced Functional Synchronization in Deeply Quiescent Excitable Systems

It filters jitter from broad resonance valleys and shows how noise drives functional synchronization in coupled cells.

Coherence resonance (CR) characterizes noise-induced regularity in excitable systems, yet its evaluation in quiescent biological media is often obscured by flattened energy landscapes and complex nonlinear dynamics. In this study, we investigate the stochastic dynamics of a 3D Sherman-Rinzel-Keizer (SRK) model driven by multiplicative Feller noise. We show that traditional extremal evaluations of CR encounter a "bathtub effect", a broad resonance valley that can lead to statistical inaccuracies. To address this, we propose a logarithmic centroid extraction method, which filters out stochastic jitter and recovers the underlying adiabatic Kramers scaling with high linearity. Furthermore, we identify the physical boundary where this adiabatic approximation breaks down under the strong-noise limit. Extending our analysis to gap-junction coupled systems, we observe a noise-induced transition from sub-threshold physiological shivering (characterized by statistical correlation but negligible functional output) to macroscopic functional synchronization. Our results provide a mathematical framework for extracting optimal noise intensities in broad energy valleys and offer insights into how quiescent biological systems utilize stochastic fluctuations for functional recovery.
q-bio.MN 2026-05-04

Logistic kernel prevents Hill-function shutdown in gene ODE models

by Ismail Belgacem

Numerical Reliability of Logistic Gene Regulatory Network Models: Preventing Expression Shutdown and Robust Integration of Boolean-Derived ODE Systems

Hill produces absorbing off-states and complex values on non-integer exponents; logistic yields reliable trajectories and explicit error bou

Gene regulatory networks are routinely translated from Boolean update rules into large continuous ODE systems integrated numerically for attractor identification, sensitivity analysis, and control design. The reliability of that integration depends critically on the sigmoidal kernel representing regulation. This simulation study shows that the Hill function -- the near-universal choice -- is a generically unreliable kernel, while the logistic function is a robust replacement. Two failure modes are demonstrated. First, because the Hill function vanishes at zero input, bistable circuits acquire an absorbing off-state: with experimentally grounded \textit{E. coli} galactose-operon autoregulation parameters, a Hill model stays trapped below the unstable separatrix, whereas the logistic model -- whose basal rate is strictly positive by construction -- escapes in about $44$~minutes through basal production alone, matching an analytical estimate of ${\approx}58$~min. A saddle-node analysis characterises the bistable window via an explicit transcendental equation and identifies the threshold $\lambda\theta=2$ separating monostable from bistable regimes. Second, when the Hill exponent is non-integer -- as in dose-response fits -- the power law $x^n=e^{n\ln x}$ turns complex-valued whenever a solver overshoots into negative concentrations. On an $80$-gene Boolean-derived benchmark with $n\approx3.509$, the Hill solver is silently contaminated by complex values from $t\approx52.64$, yielding smooth but spurious trajectories, whereas the logistic formulation completes $t\in[0,200]$ without a single warning. Because the logistic vector field is globally Lipschitz with explicit constant, we further prove an a priori global-error bound of classical order -- a guarantee structurally unavailable to the Hill formulation.
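The two failure modes described in this abstract can be reproduced in a few lines. The sketch below is a minimal illustration, not the paper's benchmark: the logistic form $1/(1+e^{-\lambda(x-\theta)})$ and the values of `theta` and `lam` are assumptions, and only the exponent $n \approx 3.509$ is taken from the abstract.

```python
import math


def hill(x, theta, n):
    """Hill activation x^n / (theta^n + x^n); equals zero at x = 0."""
    return x**n / (theta**n + x**n)


def logistic(x, theta, lam):
    """Logistic activation (assumed form); strictly positive for every x."""
    return 1.0 / (1.0 + math.exp(-lam * (x - theta)))


n = 3.509      # non-integer exponent quoted in the abstract
theta = 1.0    # hypothetical threshold
lam = 4.0      # hypothetical steepness

# Failure mode 1: Hill gives no basal production at zero input.
print(hill(0.0, theta, n), logistic(0.0, theta, lam))

# Failure mode 2: a small negative overshoot makes x**n complex in Python.
overshoot = -1e-3
print(type(hill(overshoot, theta, n)))      # <class 'complex'>
print(type(logistic(overshoot, theta, lam)))  # <class 'float'>
```

Plain Python silently returns a complex number for a negative base raised to a non-integer power; other numerical environments may return NaN or raise a warning instead, so the symptom depends on the toolchain.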
q-bio.MN 2026-04-30

Pruning protocol shrinks osteogenesis network models to six viable ones

by Jacques Demongeot, Alonso Espinoza Rojas +4 more

Parsimonious computational inference protocol for Boolean networks: Application to osteogenesis

From 51,138 candidates, successive filters keep only those matching known biology and remove spurious attractors.

Boolean networks are powerful mathematical tools for modeling the qualitative dynamics of genetic regulation. Yet inferred models often generate spurious attractors that lack biological viability. In this paper, we propose a parsimonious computational framework to systematically refine Boolean network models by eliminating these non-biological asymptotic behaviors while strictly preserving known, biologically relevant attractors. Through an exhaustive exploration of local function substitutions, we generate a comprehensive set of candidate models. To identify the most biologically consistent networks, we implement an incremental pruning protocol that filters candidates based on structural interaction digraph similarity, attraction basin topological organization, trajectorial isomorphism, and the minimization of dynamical instability and frustration. We apply this methodology to a 9-node genetic control model of the osteogenesis regulation network. Our protocol effectively evaluates a syntactic search space of 51,138 potential networks, ultimately narrowing them down to a robust family of 6 parsimonious models that are fully compatible with current biological knowledge.
physics.bio-ph 2026-04-29

Bayesian method infers motif rates from nucleic acid ligation counts

by Johannes Harth-Kitzerow, Ulrich Gerland +1 more

Bayesian Rate Inference for Sequence Motif Dynamics in Systems of Reactive Nucleic Acids

Framework matches simple models to complex simulations as a step toward experimental rate inference with uncertainty

The RNA world hypothesis suggests a pathway of how life emerged on early Earth. It assumes that life started with RNA-based systems, capable of storing, transmitting and replicating information, envisioning that monomers and short RNA oligomers interact to form longer strands, eventually becoming catalytically active ribozymes. Key reactions in RNA pools are hybridization, dehybridization, templated ligation, and cleavage. Those reactions depend on many environmental parameters and the wide range of possible configurations among interacting strands. In order to scan such high-dimensional parameter spaces, efficient descriptions are needed. Motif rate equations project complex strand reactor dynamics onto sequence motif space. Here we present a Bayesian inference framework to infer their parameters from ligation count data produced by strand reactor simulations. This provides a framework to match the simpler motif rate equations to more complex simulations. Additionally, it is a step towards inferring reaction rate constants directly from experimental data, including rigorous uncertainty estimation. This could be an essential procedure to connect theory and experiment, and deepen our understanding of the essential features necessary for life to emerge.
q-bio.MN 2026-04-28

Biophysical consistency separates true gene models from data fits

by Suryanarayana Maddu, Victor Chardès +1 more

Learning biophysical models of gene regulation with probability flow matching

Probability flow matching shows only consistent models recover lineage transitions and perturbation responses in hematopoiesis.

Cellular differentiation is governed by gene regulatory networks, the high-dimensional stochastic biochemical systems that determine the transcriptional landscape and mediate cellular responses to signals and perturbations. Although single-cell RNA sequencing provides quantitative snapshots of the transcriptome, current methods for inferring gene-regulatory dynamics often lack mechanistic interpretability and fail to generalize to unseen conditions. Here we introduce Probability Flow Matching (PFM), a scalable framework for learning biophysically consistent stochastic processes directly from time-resolved single-cell measurements. Applying PFM to three hematopoiesis datasets, we show that models with similar interpolation accuracy can encode fundamentally different dynamics, with only biophysically consistent formulations accurately capturing mechanisms of lineage transitions, fate specification, and gene perturbation responses. We further demonstrate that PFM accommodates unbalanced populations, enabling simultaneous inference of cellular proliferation and death dynamics. Together, these results establish PFM as a flexible, scalable framework for integrating mechanistic modeling with single-cell omics.
cs.LG 2026-04-28

Frozen confidence scores boost multi-omics cancer subtyping

by Boyang Fan, Hengchuang Yin +6 more

CMGL: Confidence-guided Multi-omics Graph Learning for Cancer Subtype Classification

Per-patient reliability estimates guide graph building, delivering 4 percent accuracy gains and enabling transfer between cancers.

Motivation: Multi-omics integration can improve cancer subtyping, but modality informativeness and noise vary across cancer types and patients. Existing graph-based methods optimize modality weights jointly with the classification objective and therefore lack independent reliability estimates, so low-quality omics distort patient similarity graphs and amplify noise through message passing. Results: We propose CMGL, a two-stage framework that estimates per-sample modality reliability through evidential deep learning and uses the frozen confidence scores to guide cross-omics fusion and graph construction. On four MLOmics cancer-subtype tasks and the 32-class pan-cancer task, CMGL consistently improves over the strongest baseline, surpassing it by 4.03% in average accuracy on the four single-cancer tasks. Its representations recover the PAM50 intrinsic subtypes of breast invasive carcinoma (BRCA), and the BRCA-trained model transfers without fine-tuning to kidney renal clear cell carcinoma (KIRC), stratifying patients into prognostically distinct groups.
physics.bio-ph 2026-04-21

Noise switches frustrated genes to set cell differentiation timing

by Davey Plugers, Kunihiko Kaneko

Noise-Driven Differentiation via Gene Frustration and Epigenetic Fixation

Logarithmic dependence on noise strength and input-biased fate selection emerge from switching followed by epigenetic locking.

Gene expression in cells is stochastic, yet differentiation is robust. We propose a mechanism in which frustrated genes with weakly stable intermediate expression undergo noise-driven switching between basins of attraction, followed by irreversible fate fixation through slow epigenetic feedback. Regulatory interactions amplify effective noise and promote differentiation. We derive an analytic expression for the logarithmic dependence of differentiation time on noise strength and input-dependent cell-fate selection, and demonstrate homeorhesis, the dynamical robustness of the epigenetic landscape.
cs.ET 2026-04-20

Bacterial growth curves classify nonlinear patterns as reservoirs

by Laura Alonso Bartolomé (MICALIS, Mnemosyne) +2 more

What Makes a Bacterial Model a Good Reservoir Computer? Predicting Performance from Separability and Similarity

Simulations of multiple species and mutants show high accuracy tied to differences in state-matrix ranks, pointing to living systems for low

Biological systems are promising substrates for computation because they naturally process environmental information through complex internal dynamics. In this study, we investigate whether bacterial metabolic models can act as physical reservoirs and whether their computational performance can be predicted from dynamical properties linked to separability and similarity. We simulated the growth dynamics of five bacterial species, one yeast species, and 29 Escherichia coli single-gene deletion mutants using dynamic flux balance analysis (dFBA), with glucose and xylose concentrations as inputs and growth curves as reservoir states. Computational performance was assessed on random nonlinear classification tasks using a linear readout, while reservoir properties linked to separability and similarity were characterised through kernel and generalisation ranks computed from growth-curve state matrices. Several microbial models achieved high classification accuracy, showing that bacterial metabolic dynamics can support nonlinear computation. Clear differences were observed between species, with some models converging more rapidly and others reaching higher maximum accuracy, revealing a trade-off between convergence speed and peak performance. In contrast, all E. coli mutants were dominated by the wild-type model, suggesting that gene deletions reduce the dynamical richness required for efficient computation. The difference between kernel and generalisation ranks was generally associated with improved accuracy, but deviations across models and sensitivity at low rank values limited its predictive power in practice. Overall, these results show that bacterial metabolic models constitute promising substrates for reservoir computing and provide a first step towards identifying microbial strains with favourable computational properties for future experimental implementations.
math.DS 2026-04-20

Rescaling absorbs kinetic noise to stabilize biochemical waves

by Chathranee Jayathilaka, Mark B. Flegg

Mathematical modeling of biochemical signal propagation in many-stage enzymatic pathways

A reciprocal-velocity coordinate change smooths propagation speeds and preserves profiles across enzymatic pathways with varying kinetics.

Biochemical signalling cascades transduce extracellular stimuli into cellular responses through sequences of discrete, node-to-node activations. While signal fidelity depends critically on local interaction kinetics, the mechanisms governing information propagation in realistic, highly variable kinetic contexts remain poorly understood. In this paper, we develop a mathematical framework for travelling waves in canonical feed-forward pathways governed by nonlinear Michaelis-Menten-type kinetics. For uniform pathways, we characterise the complete steady-state landscape and demonstrate that activation bias (the contribution of the binary states of each node to downstream activation) between connected nodes acts as a key bifurcation parameter dictating wave existence. Extending this framework to heterogeneous networks, we show how parameter gradients and random kinetic variations distort wavefronts and induce heavy fluctuations in propagation speed. To recover predictable signal transmission, we introduce a novel reciprocal-velocity spatial rescaling technique. We demonstrate that this coordinate transformation inherently absorbs local kinetic variations, effectively smoothing wave velocities and preserving wavefront profiles without requiring bespoke parameter tuning or continuous limits. Finally, by testing the framework's limits against extreme parameter variability, we reveal how severe kinetic bottlenecks lead to functional pathway fragmentation, offering a mathematically justified basis for rational model reduction in complex biochemical networks.
q-bio.MN 2026-04-17

Four or more distinct rows block non-vacuous ACR in zero-one networks

by Xinyi Si, Xiaoxian Tang

Absolute Concentration Robustness of Non-Redundant Zero-One Networks with Conservation Laws

Conservation laws in low-dimensional non-redundant networks prevent fixed species concentrations at steady states for generic rates.

Absolute concentration robustness (ACR) means the concentration of certain species stays the same in all the steady states. In this work, we study how conservation laws might affect non-vacuous ACR in reaction networks. The goal is to show whether non-vacuous ACR can be preserved or precluded by adding species that depend on the existing species. We have the following two main results. (i) For networks with conservation laws, we prove a criterion: for a nondegenerate network, augmenting it with one new species that depends on the original species leads to the resulting network having no non-vacuous ACR for any generic choice of rate constants in the new species. (ii) We characterize all non-redundant zero-one networks with dimension of at most two that exhibit non-vacuous ACR for any generic choice of rate constants according to the number of distinct rows in the stoichiometric matrices. An important finding is that if there are at least four distinct rows in the stoichiometric matrix, then the corresponding network has no non-vacuous ACR for any generic choice of rate constants, which implies that many conservation laws prevent non-vacuous ACR in non-redundant zero-one reaction networks.
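To make the definition of ACR concrete, the sketch below integrates a standard textbook network, $A + B \to 2B$ (rate $k_1$) and $B \to A$ (rate $k_2$), which has the conservation law $A + B = \text{total}$. This network is not taken from the paper and is not a zero-one network (it has a coefficient of 2); the rate constants, step size, and function name `steady_state` are assumptions for illustration. At any positive steady state, $B(k_1 A - k_2) = 0$ forces $A = k_2/k_1$ regardless of the total.

```python
def steady_state(total, k1=1.0, k2=2.0, dt=0.01, steps=5000):
    """Euler-integrate A + B -> 2B (rate k1) and B -> A (rate k2),
    starting from a small amount of B, with A + B = total conserved."""
    b = 0.1
    a = total - b
    for _ in range(steps):
        flux = (k1 * a - k2) * b   # dB/dt = k1*A*B - k2*B = -dA/dt
        a -= flux * dt
        b += flux * dt
    return a, b


for total in (5.0, 10.0):
    a, b = steady_state(total)
    print(f"total={total}: A={a:.6f}, B={b:.6f}")
```

With these hypothetical constants, $A$ settles at $k_2/k_1 = 2$ for both totals while $B$ absorbs the difference, which is the behaviour the term ACR describes.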
q-bio.MN 2026-04-13

Propagation computes exact Shapley values for acyclic gene networks

by Giang Pham, Silvia Giulia Galfrè +1 more

Efficient Shapley values computation for Boolean network models of gene regulation

The method yields good approximations for cyclic networks and recovers correct importance rankings on Cell Collective models with large time

Identifying dynamically influential nodes in biological networks is a central problem in systems biology, particularly for prioritizing intervention targets in gene regulatory networks. In this paper, we propose a Shapley-value-based framework for assessing the importance of nodes in a Boolean network with respect to a given target node. The framework comprises two complementary measures: the Knock-out and the Knock-in Shapley values. Moreover, we present a propagation-based method that enables their efficient computation. By exploiting the logical structure of the network, the method avoids exhaustive simulations. The approach is exact for acyclic networks and provides good approximations for cyclic networks. Evaluation on benchmark models from the Cell Collective database shows that the propagation method accurately recovers node importance rankings while achieving substantial speed-ups.
q-bio.MN 2026-04-09 1 theorem

AND-gate wiring diagrams fix the exact number of stable states

by Alan Veliz-Cuba, Zeyu Wang

A modular approach to achieve multistationarity using AND-gates

Combinatorial counting from the interaction graph lets designers build gene networks with any chosen count of phenotypes.

Systems of differential equations have been used to model biological systems such as gene and neural networks. A problem of particular interest is to understand the number of stable steady states. Here we propose conjunctive networks (systems of differential equations created using AND gates) to achieve any desired number of stable steady states. Our approach uses combinatorial tools to predict the number of stable steady states from the structure of the wiring diagram. Furthermore, AND gates have been successfully engineered by experimentalists for gene networks, so our results provide a modular approach to design gene networks that achieve an arbitrary number of phenotypes.
cs.LG 2026-04-09 2 theorems

Context conditioning enables accurate predictions on targets with only 67 examples

by Bryan Cheng, Jasper Zhang

When Does Context Help? A Systematic Study of Target-Conditional Molecular Property Prediction

Tests across ten protein families show FiLM fusion outperforms other methods by up to 24 points and temporal splits remain stable at 0.843.

We present the first systematic study of when target context helps molecular property prediction, evaluating context conditioning across 10 diverse protein families, 4 fusion architectures, data regimes spanning 67-9,409 training compounds, and both temporal and random evaluation splits. Using NestDrug, a FiLM-based architecture that conditions molecular representations on target identity, we characterize both success and failure modes with three principal findings. First, fusion architecture dominates: FiLM outperforms concatenation by 24.2 percentage points and additive conditioning by 8.6 pp; how you incorporate context matters more than whether you include it. Second, context enables otherwise impossible predictions: on data-scarce CYP3A4 (67 training compounds), multi-task transfer achieves 0.686 AUC where per-target Random Forest collapses to 0.238. Third, context can systematically hurt: distribution mismatch causes 10.2 pp degradation on BACE1; few-shot adaptation consistently underperforms zero-shot. Beyond methodology, we expose fundamental flaws in standard benchmarking: 1-nearest-neighbor Tanimoto achieves 0.991 AUC on DUD-E without any learning, and 50% of actives leak from training data, rendering absolute performance metrics meaningless. Our temporal split evaluation (train up to 2020, test 2021-2024) achieves stable 0.843 AUC with no degradation, providing the first rigorous evidence that context-conditional molecular representations generalize to future chemical space.
cs.LG 2026-04-03 1 theorem

External baseline rescues TF signatures for 59 of 61 factors in pooled screen

by Arka Jain, Umesh Sharma

Re-analysis of the Human Transcription Factor Atlas Recovers TF-Specific Signatures from Pooled Single-Cell Screens with Missing Controls

Re-analysis of the human TF Atlas uses embryoid body cells to subtract artifacts and recover many more specific transcriptional effects than

Public pooled single-cell perturbation atlases are valuable resources for studying transcription factor (TF) function, but downstream re-analysis can be limited by incomplete deposited metadata and missing internal controls. Here we re-analyze the human TF Atlas dataset (GSE216481), a MORF-based pooled overexpression screen spanning 3,550 TF open reading frames and 254,519 cells, with a reproducible pipeline for quality control, MORF barcode demultiplexing, per-TF differential expression, and functional enrichment. From 77,018 cells in the pooled screen, we assign 60,997 (79.2\%) to 87 TF identities. Because the deposited barcode mapping lacks the GFP and mCherry negative controls present in the original library, we use embryoid body (EB) cells as an external baseline and remove shared batch/transduction artifacts by background subtraction. This strategy recovers TF-specific signatures for 59 of 61 testable TFs, compared with 27 detected by one-vs-rest alone, showing that robust TF-level signal can be rescued despite missing intra-pool controls. HOPX, MAZ, PAX6, FOS, and FEZF2 emerge as the strongest transcriptional remodelers, while per-TF enrichment links FEZF2 to regulation of differentiation, EGR1 to Hippo and cardiac programs, FOS to focal adhesion, and NFIC to collagen biosynthesis. Condition-level analyses reveal convergent Wnt, neurogenic, EMT, and Hippo signatures, and Harmony indicates minimal confounding batch effects across pooled replicates. Our per-TF effect sizes significantly agree with Joung et al.'s published rankings (Spearman $\rho = -0.316$, $p = 0.013$; negative because lower rank indicates stronger effect). Together, these results show that the deposited TF Atlas data can support validated TF-specific transcriptional and pathway analyses when paired with principled external controls, artifact removal, and reproducible computation.
q-bio.MN 2026-03-30 Recognition

Dynamic graphs frame multicellular gene control from general principles

by Kyle R. Allison

Control of genes by self-organizing multicellular interaction networks

Representing cell interactions this way produces propositions for self-organization that apply across biological systems.

Multicellular self-organization drives development in biological organisms, yet a comprehensive theory is lacking as basic properties of cells can complicate common approaches. Framing such properties by dynamic graphs led to new theoretical propositions for multicellular self-organization in Escherichia coli. Here, corresponding ideas are developed from biologically-general first principles. The resulting perspective could aid both experimental and computational approaches to multicellular biology as well as efforts to control and engineer it.
q-bio.MN 2026-03-13 2 theorems

DNA circuit adds 17 trits in ternary base

by Enqiang Zhu, Peize Qiu +3 more

DNA Ternary Full Adder

Competitive blocking circuit recognizes every three-input ternary triple and returns correct sum and carry digits in biochemical tests.

As transistor dimensions continue to shrink, binary devices are rapidly approaching their fundamental limits in power density. In response, multi-valued systems have attracted significant attention due to their enhanced information density. Among these, the ternary system stands out as the most practical option, being the closest integer base to $e$, which is considered optimal for information efficiency. Despite the intrinsic advantages of DNA nanomaterials, such as programmability, energy efficiency, and massive parallelism, their application in ternary logic remains largely unexplored, particularly in the realm of ternary addition circuits. This gap can be attributed to a fundamental challenge: ternary logic requires circuits capable of recognizing and processing a far larger set of input combinations than binary systems, a task that existing models and techniques often struggle to accomplish. In this work, we propose a novel architecture for a ternary full adder. Our design includes a competitive blocking (CB) circuit that enables the recognition and computation of all possible three-input ternary combinations. Coupled with a dynamic concentration adjustment (CA) strategy, this approach significantly enhances the number of trits that can be processed. Biochemical experiments demonstrate that the CB circuit successfully yields the correct output digits for a ternary full adder, achieving 17-trit ternary addition. To our knowledge, this work represents the first successful DNA-based ternary adder, establishing a new methodological foundation for DNA computing and highlighting its considerable potential for scalable digital information processing.
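For readers unfamiliar with ternary arithmetic, the logic a ternary full adder must realize can be sketched as follows. This models only the digit-level arithmetic, not the DNA circuit or its chemistry; the function names and the little-endian trit representation are assumptions made for illustration.

```python
def ternary_full_adder(a, b, carry_in):
    """Return (sum_digit, carry_out) for three input trits in {0, 1, 2}."""
    total = a + b + carry_in
    return total % 3, total // 3


def to_trits(value, width):
    """Little-endian list of `width` trits for a non-negative integer."""
    return [(value // 3**i) % 3 for i in range(width)]


def from_trits(trits):
    """Inverse of to_trits: rebuild the integer from little-endian trits."""
    return sum(d * 3**i for i, d in enumerate(trits))


def add_trits(x, y):
    """Ripple-carry addition of two equal-length little-endian trit lists."""
    carry, out = 0, []
    for a, b in zip(x, y):
        digit, carry = ternary_full_adder(a, b, carry)
        out.append(digit)
    return out + [carry]   # final carry becomes the top trit


# Add two 17-trit operands by chaining 17 full adders.
x, y = 3**17 - 1, 12345
result = add_trits(to_trits(x, 17), to_trits(y, 17))
print(from_trits(result) == x + y)  # True
```

A full adder has $3^3 = 27$ input triples, compared with $2^3 = 8$ in the binary case, which is the larger recognition problem the abstract refers to.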
q-bio.MN 2026-03-09 Recognition

CRN generalization of next generation matrix aids ODE stability

by Florin Avram, Rim Adenane +1 more

A cocktail of chemical reaction networks and mathematical epidemiology tools for positive ODE stability problems

The approach proves stability for positive systems in epidemiology and chemistry by structural properties of reaction networks.

We continue recent attempts to put together concepts and results of Chemical Reaction Networks theory (CRNT) and Mathematical Epidemiology (ME), for solving problems of stability of positive ODEs. We provide first an elegant CRN-flavored generalization of the most cited result in ME, the Next Generation Matrix (NGM) theorem. We review next the "symbolic-numeric approach" of Vassena and Stadler, which tackles bifurcation problems by viewing the characteristic polynomial of the Jacobian at fixed points as a formal polynomial in the "symbolic reactivities", and identifies its coefficients as "Child Selection minors of the stoichiometric matrix". We also review two applications of this approach using the Mathematica package Epid-CRN tools from both CRNT and ME.
cond-mat.stat-mech 2026-03-05 2 theorems

Ising model reduces thin-filament cooperativity to two parameters

by Elaheh Saadat, Matthieu Caruel +7 more

Ising Models of Cooperativity in Muscle Contraction

Calcium and myosin force fix the activation spread at two to seven extra actin monomers per regulatory unit and fit data with and without Om

Regulation of contraction in striated muscle is controlled by a dual mechanism involving both thin filaments containing actin and thick filaments containing myosin. The thin filament is activated by calcium ions binding to troponin, leading to tropomyosin azimuthal displacement which allows the activation of a regulatory unit (composed of one troponin, one tropomyosin and seven actin monomers) that exposes the actin sites for interaction with the myosin motors. Motor attachment to actin contributes to spreading activation within and beyond a regulatory unit along the thin filament through a cooperative mechanism. We introduce a one-dimensional Ising model to elucidate the mechanism of cooperativity in thin filament activation in relation to the force generated by the attached myosin motor. The model characterizes thin filament activation and cooperativity using only two parameters: one related to calcium concentration and the other to the force exerted by the attached myosin motor, which is modulated by temperature. At any force, the model is able to determine the extent of actin-myosin interactions on a correlation length ranging from two to seven actin monomers in addition to the seven actin monomers of the regulatory unit. Our theoretical predictions are successfully tested on experimental data, and our tests also include the condition of hindered filament activation by the use of the specific drug Omecamtiv Mecarbil (OM). According to our model, the effect of OM results in an anti-cooperativity mechanism accounting for the experimental data.
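For orientation, the correlation length of the textbook one-dimensional Ising chain follows from the two eigenvalues of its transfer matrix. The sketch below uses that standard result only; identifying `coupling` with the force-related parameter and `field` with the calcium-related parameter is an assumption made here for illustration, since the abstract does not give the authors' parameterization. The function name is hypothetical.

```python
import math


def ising_correlation_length(coupling, field, beta=1.0):
    """Correlation length (in lattice sites) of the 1D Ising chain.

    Uses the transfer-matrix eigenvalues
    lam = e^{bJ} cosh(bh) +/- sqrt(e^{2bJ} sinh^2(bh) + e^{-2bJ}).
    """
    if coupling == 0:
        raise ValueError("uncoupled sites have no finite correlation length")
    base = math.exp(beta * coupling) * math.cosh(beta * field)
    root = math.sqrt(
        math.exp(2 * beta * coupling) * math.sinh(beta * field) ** 2
        + math.exp(-2 * beta * coupling)
    )
    lam_plus, lam_minus = base + root, base - root
    # abs() also covers negative coupling, where lam_minus is negative.
    return 1.0 / math.log(abs(lam_plus / lam_minus))
```

At zero field the expression reduces to $-1/\ln\tanh(\beta J)$, so the correlation length grows with the coupling and shrinks when a field is applied.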
q-bio.MN 2026-02-19 Recognition

Gene networks need minimum size to oscillate amid noise

by Manuel Eduardo Hernández-García, Jorge Velázquez-Castro

Oscillation Criteria in Large-Scale Gene Regulatory Networks with Intrinsic Fluctuations

Second-moment equations show the scale at which molecular fluctuations stop suppressing cycles in feedback loops.

Gene Regulatory Networks (GRNs) with feedback are essential components of many cellular processes and may exhibit oscillatory behavior. Analyzing such systems becomes increasingly complex as the number of components increases. Since gene regulation often involves a small number of molecules, fluctuations are inevitable. Therefore, it is important to understand how fluctuations affect the oscillatory dynamics of cellular processes, as this will allow comprehension of the mechanisms that enable cellular functions to remain even in the presence of fluctuations or, failing that, to determine the limit of fluctuations that permits various cellular functions. In this study, we investigated the conditions under which GRNs with feedback and intrinsic fluctuations exhibit oscillatory behavior. Our focus was on developing a procedure that would be both manageable and practical, even for extensive regulatory networks, that is, those comprising numerous nodes. Using the second-moment approach, we described the stochastic dynamics through a set of ordinary differential equations for the mean concentration and its second central moment. The system can attain either a stable equilibrium or oscillatory behavior, depending on its scale and, consequently, the intensity of fluctuations. To illustrate the procedure, we analyzed two relevant systems: a repressilator with three nodes and a system with five nodes, both incorporating intrinsic fluctuations. In both cases, it was observed that for very small systems, which therefore exhibit significant fluctuations, oscillatory behavior is inhibited. The procedure presented here for analyzing the stability of oscillations under fluctuations enables the determination of the critical minimum size of GRNs at which intrinsic fluctuations do not eliminate their cyclical behavior.
q-bio.MN 2026-02-02 2 theorems

RAG-GNN lifts protein clustering silhouette score by 0.093

by Hasi Hays, William J. Richardson

RAG-GNN: Integrating Retrieved Knowledge with Graph Neural Networks for Precision Medicine

Literature retrieval fused into graph networks yields more accurate functional groupings than topology alone in cancer pathways.

Network topology excels at structural predictions but fails to capture functional semantics encoded in biomedical literature. We present RAG-GNN, an end-to-end trainable retrieval-augmented graph neural network framework that integrates GNN representations with dynamically retrieved literature-derived knowledge through a jointly optimized retrieval projection, gated fusion mechanism, and contrastive alignment. In a cancer signaling case study (379 proteins, 3,498 interactions, 14 functional categories), RAG-GNN improves functional clustering from silhouette $= -0.237 \pm 0.065$ (GNN-only) to $-0.144 \pm 0.066$, a consistent improvement of $+0.093 \pm 0.022$ across 10 random seeds, while the learned retrieval achieves mean precision@10 $= 0.242$, a 152\% improvement over the random baseline ($0.096$). Heuristic information decomposition with bootstrap confidence intervals reveals that topology and retrieval encode overwhelmingly shared information (95.6\%), with retrieval improving both intra-cluster cohesion (silhouette) and cluster agreement (ARI $+0.021 \pm 0.015$). Counterfactual experiments confirm that adversarial, absent, and random retrieval all degrade performance, validating that the gated fusion mechanism depends on document content. Benchmarking against eight established embedding methods demonstrates task-specific complementarity: topology-focused methods achieve strong link prediction, while retrieval augmentation consistently improves functional clustering within the controlled GNN-only ablation. DDR1 subnetwork analysis provides confirmatory validation consistent with established synthetic lethality relationships. These results establish that topology-only and retrieval-augmented approaches serve complementary purposes for precision medicine applications.
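The silhouette values quoted above range from $-1$ (points sit closer to another cluster than to their own) to $+1$ (tight, well-separated clusters). A minimal pure-Python sketch of the standard definition follows; it assumes Euclidean distance and is not taken from the RAG-GNN code, and the function name is hypothetical.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # singleton cluster: score defined as 0
            scores.append(0.0)
            continue
        a = mean_dist(i, own)  # cohesion: mean distance within own cluster
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

Negative means such as the $-0.237$ and $-0.144$ reported above indicate that, on average, proteins lie closer to a neighboring functional category than to their own, so the reported gain is a move toward zero rather than a switch to well-separated clusters.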
physics.hist-ph 2026-01-05 2 theorems

Assembly theory measures causation to define life

by Leroy Cronin, Sara I. Walker

The Physics of Causation

The minimum recursive steps to build objects, tracked by copy numbers, set a physical threshold separating living structures from non-living

Assembly theory (AT) introduces causation as a material property and establishes a metrology for objects produced by evolution and selection. The physical scale of causation is quantified by the assembly index, defined as the minimum number of recursive steps necessary to make an object. Observing countable copies of high assembly index objects indicates a mechanism producing them is persistent, such that the object's environment constructs a memory that traps causation within a contingent chain. Copy number and assembly index together underlie a standardized metrology for detecting causation (assembly index) and contingency (copy number). These allow a precise definition of an assembly threshold that demarcates life (and its derivative agential, intelligent, and technological forms and artifacts) as structures with persistent copies in regimes of deep causal possibility. In introducing a fundamental concept of material causation to quantify and measure life, AT represents a departure from prior theories of causation, such as interventional ones, which have so far proven incompatible with fundamental physics. We discuss how AT's concept of causation provides the foundation for a theory of physics that allows a precise and testable concept of "life", and in which novelty, contingency and the potential for open-endedness are fundamental, and determinism is emergent from selection along assembled lineages.
q-bio.MN 2025-12-31 2 theorems

Epigenetic feedback reshapes the Waddington landscape

by Sascha H. Hauck, Sandip Saha +2 more

Epigenetic feedback reshapes dynamical landscapes in gene regulatory networks

A DMFT model shows how slow epigenetic changes create dynamic potential wells that guide cell state transitions.

Understanding how gene regulatory networks (GRNs) give rise to stable and dynamic cellular states remains a central challenge in theoretical biology, particularly when slow epigenetic feedback reshapes the underlying regulatory landscape. While experimental approaches such as single-cell transcriptomics reveal rich dynamical behaviour, a tractable theoretical framework that links gene expression, epigenetic control, and collective dynamics remains challenging. Here, we develop an extended Dynamical Mean Field Theory (DMFT) framework for GRNs that incorporates epigenetic modifications as slow, feedback-driven variables. Building on the analogy between Hopfield networks and spin glass systems, we derive effective stochastic equations that reduce high-dimensional dynamics to a tractable form across multiple timescales. This formulation enables quantitative characterization of both stable and oscillatory regimes and reveals how epigenetic feedback reshapes the effective potential landscape governing cell fate decisions. Our model shows how epigenetic feedback regulation dynamically reshapes the Waddington landscape. Our results and methodology provide a unified theoretical framework for understanding developmental dynamics and epigenetic reprogramming in complex biological systems.
physics.bio-ph 2025-12-18 1 theorem

Hodgkin-Huxley model fits plant APs in changing light conditions

by Imen Bekkari, Maurizio Magarini +1 more

Modeling Plant Action Potentials under Photoperiod Stress via Hodgkin-Huxley Dynamics

Voltage-independent rates let the equations capture both slow and rapid electrical responses in tobacco under controlled photoperiods.

Plants exhibit dynamic bioelectric properties that facilitate information transfer across tissues. This study investigates action potentials (APs) in Nicotiana tabacum recorded within a custom-designed growth chamber using a biosignal amplifier and environmental sensors. Consistent light- and dark-induced APs were observed during photoperiod transitions under controlled 12-hour artificial illumination cycles. To understand these bioelectric responses, a mathematical model based on the Hodgkin-Huxley framework is used. Electrophysiological measurements from Solanum lycopersicum revealed that under natural light conditions, only light-induced APs are observed, while the coupled dynamics of light- and dark-induced APs is elicited exclusively during rapid transitions in artificial photoperiods. These distinct phenomena are characterized as Prolonged Oscillatory Climatic Engagement (POCE) and Nimble Environmental Transition Oscillation (NETO), respectively. The model successfully reproduces the key features in both frameworks while maintaining computational efficiency through voltage-independent rate parameters.
q-bio.MN 2025-12-08 2 theorems

Enzyme binding shifts diffusion pattern regions in metabolism

by Faezeh Farivar

Enzyme-Substrate Complex Formation Modulates Diffusion-Driven Patterning In Metabolic Pathways

Explicit complex formation in a two-step pathway changes the instability conditions compared to effective kinetics models.

Spatial organization in metabolic pathways can arise from the interplay between enzymatic reaction kinetics and diffusion-driven instabilities. In this work we investigate how reversible enzyme--substrate binding influences pattern formation in a two-step metabolic pathway. Starting from a mechanistic description in which the substrate reversibly binds to the first enzyme before catalytic conversion, we formulate a three-species reaction--diffusion system that explicitly incorporates the enzyme--substrate complex. We first analyse the homogeneous dynamics and determine the unique steady state of the kinetic system. Exploiting the separation of time scales between the rapid binding kinetics and the slower evolution of metabolite concentrations, we derive a reduced two-variable model using a quasi-steady-state approximation for the enzyme-substrate complex. This reduction preserves the essential nonlinear coupling between catalytic reactions and spatial transport. Linear stability and weakly nonlinear analysis reveal conditions for diffusion-driven (Turing) instability and show that reversible enzyme binding significantly modifies the location and extent of the instability region compared to models with effective kinetics. Numerical simulations confirm the analytical predictions and demonstrate how enzyme-substrate interactions reshape pattern selection and slow the emergence of spatial heterogeneity. These results provide a mechanistic link between enzyme binding kinetics, diffusion-driven pattern formation, and mesoscale metabolic organization. The proposed framework offers a tractable approach for studying spatial patterning in enzymatic networks and may help explain the emergence of structured biochemical domains such as those associated with liquid--liquid phase separation.
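The linear stability step mentioned above can be illustrated with the textbook conditions for diffusion-driven instability in a generic two-variable reaction-diffusion system. The sketch assumes a Jacobian with entries `fu, fv, gu, gv` evaluated at the homogeneous steady state and diffusion coefficients `du, dv`; these names are hypothetical, and the conditions are the standard ones rather than the specific expressions derived for the reduced model in the paper.

```python
def turing_unstable(fu, fv, gu, gv, du, dv):
    """Check the classical two-species Turing instability conditions.

    The steady state must be stable without diffusion (trace < 0, det > 0)
    and unstable to some spatial mode once diffusion is included.
    """
    trace = fu + gv
    det = fu * gv - fv * gu
    stable_without_diffusion = trace < 0 and det > 0

    # det(J - k^2 D) = du*dv*k^4 - h*k^2 + det must go negative for some k^2 > 0.
    h = dv * fu + du * gv
    unstable_with_diffusion = h > 0 and h * h > 4 * du * dv * det
    return stable_without_diffusion and unstable_with_diffusion
```

For example, with `fu=1, fv=-2, gu=3, gv=-4` and `du=1`, the check fails at `dv=10` and passes at `dv=20`, reflecting the usual requirement that the inhibitor diffuse sufficiently faster than the activator.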
q-bio.MN 2025-11-25 2 theorems

Graph conditions enable enumeration of autocatalytic subnetworks in large CRNs

by Richard Golnik, Thomas Gatter +2 more

Enumeration of Autocatalytic Subsystems in Large Chemical Reaction Networks

The method locates self-maintaining subsystems and their minimal cores in full metabolic models of E. coli and other organisms.

Autocatalysis is an important feature of metabolic networks, contributing crucially to the self-maintenance of organisms. Autocatalytic subsystems of chemical reaction networks (CRNs) are characterized in terms of algebraic conditions on submatrices of the stoichiometric matrix. Here, we derive sufficient conditions for subgraphs supporting irreducible autocatalytic systems in the bipartite Kőnig representation of the CRN. On this basis, we develop an efficient algorithm to enumerate autocatalytic subnetworks and, as a special case, autocatalytic cores, i.e., minimal autocatalytic subnetworks, in full-size metabolic networks. The same algorithmic approach can also be used to determine autocatalytic cores only. As a showcase application, we provide a complete analysis of autocatalysis in the core metabolism of E. coli and enumerate irreducible autocatalytic subsystems of limited size in full-fledged metabolic networks of E. coli, human erythrocytes, and Methanosarcina barkeri (Archaea). The mathematical and algorithmic results are accompanied by software enabling the routine analysis of autocatalysis in large CRNs.
cs.LG 2025-11-07 2 theorems

Spectral interpolation improves rare molecular property predictions

by Brenda Nogueira, Gisela A. Gonzalez-Montiel +3 more

SPECTRA: Spectral Domain-Aware Graph Generation for Imbalanced Molecular Property Regression

By aligning target-neighbor graphs and interpolating Laplacian spectra, SPECTRA focuses generation on scarce ranges while cutting compute by

Molecular property regression struggles with cases in chemically relevant target ranges that are underrepresented in datasets. Standard average error minimization approaches underperform in these highly relevant cases, and oversampling approaches lead to meaningless molecular representations. In this paper, we propose SPECTRA, a spectral, domain-aware graph generation method designed to improve the prediction of underrepresented but relevant molecular property values. It combines a rarity-aware budgeting scheme to focus generation where data are scarce, target-neighbors graph alignment to establish structural correspondence, and interpolation of Laplacian spectra, node features, and targets. Coupled with spectral GNN using edge-aware Chebyshev convolutions, SPECTRA shows its effectiveness in property prediction benchmarks with competitive performance over leading state-of-the-art methods in relevant target ranges, while requiring ~4x less computational time.
q-bio.MN 2025-11-06 Recognition

Gene ranking raises metabolic design success rates 37-186%

by Yier Ma, Takeyuki Tamura

A Gene Ranking Framework Enhances the Design Efficiency of Genome-Scale Constraint-Based Metabolic Networks under Time Limits

Pre-assigning top genes and solving smaller subproblems in parallel finds more growth-coupled solutions before time expires.

The design of genome-scale constraint-based metabolic networks has steadily advanced, with an increasing number of successful cases achieving growth-coupled production, in which the biosynthesis of key metabolites is linked to cell growth. However, a major cause of design failures is the inability to find solutions within realistic time limits. Therefore, it is essential to develop methods that achieve a high success rate within the specified computation time. In this study, we propose a framework for ranking the importance of individual genes to accelerate the solution of the original mixed-integer linear programming (MILP) problems in the design of constraint-based models. In the proposed method, after pre-assigning values to highly important genes, the MILPs are solved in parallel as a series of mutually exclusive subproblems. We found that our framework recovered most of the successful cases identified by the original approach and achieved a 37% to 186% increase in success rate compared to the original method within the same time limits. Analysis of the MILP solution process revealed that the proposed method reduced the sizes of subproblems and decreased the number of nodes in the branch-and-bound tree. This framework for ranking gene importance is directly applicable to a range of MILP-based algorithms for the design of constraint-based metabolic networks. The developed scripts are available at https://github.com/MetNetComp/Gene-Ranked-RatGene.
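One way to picture the split into mutually exclusive subproblems is to fix the binary state of the top-ranked genes in every possible way, so that each assignment defines a smaller MILP that can be solved independently. The sketch below assumes exhaustive enumeration over the top `k` genes; that choice, the function name, and the gene labels are hypothetical and not taken from the linked repository.

```python
from itertools import product


def split_into_subproblems(ranked_genes, k):
    """Return one partial assignment (gene -> 0/1) per subproblem.

    Assumes genes are sorted by decreasing importance and that the
    k most important ones are pre-assigned in each subproblem.
    """
    top = ranked_genes[:k]
    return [dict(zip(top, values)) for values in product((0, 1), repeat=len(top))]


# Hypothetical gene labels, for illustration only.
subproblems = split_into_subproblems(["g1", "g2", "g3", "g4"], k=2)
```

Because any two assignments differ in at least one fixed gene, no full solution belongs to two subproblems, which is what allows them to be dispatched to parallel workers without coordination.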
cond-mat.stat-mech 2025-10-10 2 theorems

Bacterial transitions lack single rate outside small-noise limit

by Jianzhe Wei, Jingwen Zhu +3 more

Cell State Transitions Beyond the Small-Noise Limit

Tracking over 1000 cells shows multiplicative noise slows switches and questions discrete-state models.

State transitions are fundamental in biological systems but challenging to observe directly. Here, we present the first single-cell observation of state transitions in a synthetic bacterial genetic circuit. Using a mother machine, we tracked over 1007 cells for 27 hours. First-passage analysis and dynamical reconstruction reveal that transitions occur outside the small-noise regime, challenging the applicability of classical Kramers' theory. The process lacks a single characteristic rate, questioning the paradigm of transitions between discrete cell states. We observe significant multiplicative noise that distorts the effective potential landscape yet increases transition times. These findings necessitate theoretical frameworks for biological state transitions beyond the small-noise assumption.
q-bio.MN 2025-07-03 1 theorem

Algorithm simulates exact bursty gene networks at low cost

by Mathilde Gaillard, Ulysse Herbach

Efficient stochastic simulation of gene regulatory networks using hybrid models of transcriptional bursting

It shows bimodal expression arises from interaction-driven burst-frequency differences, not bursting alone.

Single-cell data reveal the presence of biological stochasticity between cells of identical genome and environment, in particular highlighting the transcriptional bursting phenomenon. To account for this property, gene expression may be modeled as a continuous-time Markov chain where biochemical species are described in a discrete way, leading to Gillespie's stochastic simulation algorithm (SSA) which turns out to be computationally expensive for realistic mRNA and protein copy numbers. Alternatively, hybrid models based on piecewise-deterministic Markov processes (PDMPs) offer an effective compromise for capturing cell-to-cell variability, but their simulation remains limited to specialized mathematical communities. With a view to making them more accessible, we present here a simple simulation method that is reminiscent of SSA, while allowing for much lower computational cost. We detail the algorithm for a bursty PDMP describing an arbitrary number of interacting genes, and prove that it simulates exact trajectories of the model. As an illustration, we use the algorithm to simulate a two-gene toggle switch: this example highlights the fact that bimodal distributions as observed in real data are not explained by transcriptional bursting per se, but rather by distinct burst frequencies that may emerge from interactions between genes.
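For reference, the baseline that the abstract compares against, Gillespie's SSA, can be sketched for the simplest case of a single mRNA species with constant production and first-order degradation. This is the classical SSA, not the PDMP algorithm of the paper; the rate names `k_prod` and `k_deg` and the function name are hypothetical.

```python
import random


def gillespie_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Simulate one exact trajectory of a birth-death process.

    Production fires at rate k_prod, degradation at rate k_deg * n.
    Returns the jump times and the copy number after each jump.
    """
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        rate_prod, rate_deg = k_prod, k_deg * n
        total = rate_prod + rate_deg
        if total == 0:
            break  # no reaction can fire
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its rate.
        if rng.random() * total < rate_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts
```

Every molecular event costs one loop iteration, which is why the method becomes expensive at realistic mRNA and protein copy numbers.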
cs.LG 2025-06-27 2 theorems

Dual diffusion models align unpaired single-cell perturbations

by Changxi Chi, Jun Xia +10 more

Doloris: Dual Conditional Diffusion Implicit Bridges with Sparsity Masking Strategy for Unpaired Single-Cell Perturbation Estimation

Shared latent space plus zero masking lets the model predict diverse responses without cell pairing for faster drug screening

Estimating single-cell responses across various perturbations facilitates the identification of key genes and enhances drug screening, significantly boosting experimental efficiency. However, single-cell sequencing is a destructive process, making it impossible to capture the same cell's phenotype before and after perturbation. Consequently, data collected under perturbed and unperturbed conditions are inherently unpaired, creating a critical yet unresolved problem in single-cell perturbation modeling. Moreover, the high dimensionality and sparsity of single-cell expression make direct modeling prone to focusing on zeros and neglecting meaningful patterns. To address these problems, we propose a new paradigm for single-cell perturbation modeling. Specifically, we leverage dual diffusion models to learn the control and perturbed distributions separately, and implicitly align them through a shared Gaussian latent space, without requiring explicit cell pairing. Furthermore, we introduce a sparsity masking strategy in which the mask model learns to predict zero-expressed genes, allowing the diffusion model to focus on capturing meaningful patterns among expressed genes and thereby preserving diversity in high-dimensional sparse data. We introduce \textbf{Doloris}, a generative framework that defines a new paradigm for modeling unpaired, high-dimensional, and sparse single-cell perturbation data. It leverages dual conditional diffusion models for separate learning of control and perturbed distributions, complemented by a sparsity masking strategy to enhance prediction of zero-valued genes. The results on publicly available datasets show that our model effectively captures the diversity of single-cell perturbations and achieves state-of-the-art performance. To facilitate reproducibility, we include the code in the supplementary materials.
q-bio.QM 2025-06-23 2 theorems

Mass spectrometry now quantifies thousands of proteins in single cells

by Nikolai Slavov

Single-Cell Proteomic Technologies: Tools in the quest for principles

Review projects room for scaling throughput and extending to functional measurements that could support biophysical models of cells.

Over the last decade, proteomic analysis of single cells by mass spectrometry transitioned from an uncertain possibility to a set of robust and rapidly advancing technologies supporting the accurate quantification of thousands of proteins. We review the major drivers of this progress, from establishing feasibility to powerful and increasingly scalable methods. We focus on the tradeoffs and synergies of different technological solutions within a coherent conceptual framework, which projects considerable room both for throughput scaling and for extending the analysis scope to functional protein measurements. We highlight the potential of these technologies to support the development of mechanistic biophysical models and help uncover new principles.
q-bio.MN 2025-06-11 2 theorems

GPUs speed up logic model searches for gene networks up to 19 times

by Joyce Reimer, Pranta Saha +5 more

GPU-accelerated Modeling of Biological Regulatory Networks

Speedups of 33-1866% over CPU methods make fitting regulatory models to data practical for larger biological systems.

The complex regulatory dynamics of a biological network can be succinctly captured using discrete logic models. Given even sparse time-course data from the system of interest, previous work has shown that global optimization schemes are suitable for proposing logic models that explain the data and make predictions about how the system will behave under varying conditions. Considering the large scale of the parameter search spaces associated with these regulatory systems, performance optimizations on the level of both hardware and software are necessary for making this a practical tool for in silico pharmaceutical research. We show here how the implementation of these global optimization algorithms in a GPU-computing environment can accelerate the solution of these parameter search problems considerably. We carry out parameter searches on two model biological regulatory systems that represent almost an order of magnitude scale-up in complexity, and we find the gains in efficiency from GPU to be a 33%-43% improvement compared to multi-thread CPU implementations and a 33%-1866% increase compared to CPU in serial. These improvements make global optimization of logic model identification a far more attractive and feasible method for in silico hypothesis generation and design of experiments.
math.PR 2025-05-13 Recognition

Reaction rates can be non-identifiable from SDE laws

by Louis Faul, Linard Hoessly +1 more

Identifiability of SDEs for reaction networks

Some networks allow multiple rate choices or different graphs to produce identical diffusion laws even when the full law is known.

Biochemical reaction networks are widely applied across scientific disciplines to model complex dynamic systems. We investigate the diffusion approximation of reaction networks with mass-action kinetics, focusing on the identifiability of the stochastic differential equations associated to the reaction network. We derive conditions under which the law of the diffusion approximation is identifiable and provide theorems for verifying identifiability in practice. Notably, our results show that some reaction networks have non-identifiable reaction rates, even when the law of the corresponding stochastic process is completely known. Moreover, we show that reaction networks with distinct graphical structures can generate the same diffusion law under specific choices of reaction rates. Finally, we compare our framework with identifiability results in the deterministic ODE setting and the discrete continuous-time Markov chain models for reaction networks.
q-bio.QM 2025-04-22 2 theorems

Tweaked LNA models non-linear population dynamics long-term

by Frederick Truman-Williams, Giorgos Minas

Simulating stochastic population dynamics: The Linear Noise Approximation can capture non-linear phenomena

Centre manifold adjustments keep the fast approximation accurate for oscillations and bistability.

Population dynamics in fields such as molecular biology, epidemiology, and ecology exhibit highly stochastic and non-linear behaviour. In gene regulatory systems in particular, oscillations and multi-stability are especially common. Despite this, none of the currently available stochastic models for population dynamics are both accurate and computationally efficient for long-term predictions. A prominent model in this field, the Linear Noise Approximation (LNA), is computationally efficient for tasks such as simulation, sensitivity analysis, and parameter estimation; however, it is only accurate for linear systems and short-time predictions. Other models may achieve greater accuracy across a broader range of systems, but they sacrifice computational efficiency and analytical tractability. This paper demonstrates that, with specific modifications, the LNA can accurately capture non-linear dynamics in population processes. We introduce a new framework based on centre manifold theory, a classical concept from non-linear dynamical systems. This approach enables the identification of simple, system-specific modifications to the LNA, tailored to classes of qualitatively similar non-linear dynamical systems. With these modifications, the LNA can achieve accurate long-term simulations without compromising computational efficiency. We apply our methodology to classes of oscillatory and bi-stable systems, and present multiple examples from molecular population dynamics that demonstrate accurate long-term simulations alongside significant improvements in computational efficiency.
quant-ph 2025-04-14 1 theorem

Grover search recovers Boolean logic in 5-protein brain network

by Aspen Erlandsson Brisebois, Jason Broderick +4 more

Identifying Protein Co-regulatory Network Logic by Solving B-SAT Problems through Gate-based Quantum Computing

Sparse expression data yields accurate models on both simulators and real NISQ hardware for cortical development circuit

There is growing awareness that the success of pharmacologic interventions on living organisms is significantly impacted by context and timing of exposure. In turn, this complexity has led to an increased focus on regulatory network dynamics in biology and our ability to represent them in a high-fidelity way, in silico. Logic network models show great promise here and their parameter estimation can be formulated as a constraint satisfaction problem (CSP) that is well-suited to the often sparse, incomplete data in biology. Unfortunately, even in the case of Boolean logic, the combinatorial complexity of these problems grows rapidly, challenging the creation of models at physiologically-relevant scales. That said, quantum computing, while still nascent, facilitates novel information-processing paradigms with the potential for transformative impact in problems such as this one. In this work, we take a first step at actualizing this potential by identifying the structure and Boolean decisional logic of a well-studied network linking 5 proteins involved in the neural development of the mammalian cortical area of the brain. We identify the protein-protein connectivity and binary decisional logic governing this network by formulating it as a Boolean Satisfiability (B-SAT) problem. We employ Grover's algorithm to solve the NP-hard problem faster than the exponential time complexity required by deterministic classical algorithms. Using approaches deployed on both quantum simulators and actual noisy intermediate scale quantum (NISQ) hardware, we accurately recover several high-likelihood models from very sparse protein expression data. The results highlight the differential roles of data types in supporting accurate models; the impact of quantum algorithm design as it pertains to the mutability of quantum hardware; and the opportunities for accelerated discovery enabled by this approach.
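The amplitude amplification at the heart of Grover's algorithm can be reproduced with a small classical state-vector simulation. The sketch below treats the search space as `n_items` candidate assignments with a known set of `marked` (satisfying) indices; it does not encode the B-SAT oracle or circuit used in the paper, and the function names are hypothetical.

```python
import math


def grover_success_probability(n_items, marked, iterations):
    """Probability of measuring a marked item after the given Grover iterations.

    `marked` must contain distinct indices in range(n_items).
    """
    amp = [1 / math.sqrt(n_items)] * n_items  # uniform superposition
    for _ in range(iterations):
        for m in marked:  # oracle: flip the sign of marked amplitudes
            amp[m] = -amp[m]
        mean = sum(amp) / n_items  # diffusion: inversion about the mean
        amp = [2 * mean - a for a in amp]
    return sum(amp[m] ** 2 for m in marked)


def optimal_iterations(n_items, n_marked):
    """Iteration count that brings the rotation angle closest to pi/2."""
    theta = math.asin(math.sqrt(n_marked / n_items))
    return max(0, round(math.pi / (4 * theta) - 0.5))
```

With 8 candidates and one solution, two iterations give a success probability of 121/128, about 0.945. The iteration count grows as the square root of the number of candidates, so the speedup over exhaustive classical search is quadratic.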
q-bio.CB 2025-01-21 2 theorems

Waddington vector field adds gene noise and cell signals to tissue models

by Casey O. Barkan, Tom Chou

Incorporating stochastic gene expression, signaling-mediated intercellular interactions, and regulated cell proliferation in models of coordinated tissue development

The framework links an epigenetic fitness landscape to proliferation rates while allowing cycles and entropy production in gene dynamics.

Formulating quantitative and predictive models for tissue development requires consideration of the complex, stochastic gene expression dynamics, its regulation via cell-to-cell interactions, and cell proliferation. Including all of these processes in a practical mathematical framework requires complex expressions that are difficult to interpret and apply. We construct a simple theory that incorporates intracellular stochastic gene expression dynamics, signaling chemicals that influence these dynamics and mediate cell-cell interactions, and cell proliferation and its accompanying differentiation. Cellular states (genetic and epigenetic) are described by a Waddington vector field that allows for non-gradient dynamics (cycles, entropy production, loss of detailed balance), which are precluded in Waddington potential landscape representations of gene expression dynamics. We define an epigenetic fitness landscape that describes the proliferation of different cell types, and elucidate how this fitness landscape is related to Waddington's vector field. We illustrate the applicability of our framework by analyzing two model systems: an interacting two-gene differentiation process and a spatiotemporal organism model inspired by planaria.
q-bio.MN 2024-12-24 2 theorems

Reaction networks with power-law kinetics admit an ideal-theoretic characterization of…

by Elisenda Feliu, Oskar Henriksson +1 more

The generic geometry of steady state varieties

Ideal theory applied to vertically parametrized systems shows when steady-state concentrations remain fixed for generic parameters.

We answer several fundamental geometric questions about reaction networks with power-law kinetics, on topics such as generic finiteness of the number of steady states, robustness, and nondegenerate multistationarity. In particular, we give an ideal-theoretic characterization of generic absolute concentration robustness, as well as conditions under which a network that admits multiple steady states also has the capacity for nondegenerate multistationarity. The key tools underlying our results come from the theory of vertically parametrized systems, and include a linear algebra condition that characterizes when the steady state system has positive nondegenerate zeros.
physics.bio-ph 2024-10-14 Recognition

Dissipative networks evolve decoupled fuel modules

by Bowen Shi, Long Qian +1 more

Evolutionary origin of the bipartite architecture of dissipative cellular networks

Simulations selecting only for function produce isolated energy modules that raise dissipation and robustness.

Much recent research has examined the role of energy dissipation in biological networks, most of it focusing on the relationship between dissipation and functionality. However, the development of network science urges us to understand the systematic architecture of biological networks and their evolutionary advantages. We find that the dissipation of biological dissipative networks is closely related to their structure. By interrogating these well-adapted networks, we find that the energy-producing module is relatively isolated in all situations. We applied evolutionary simulation and analysis to premature versions of classic dissipative networks, namely kinetic proofreading, an activator-inhibitor oscillator, and two typical adaptive response models. We found that, although selection was imposed merely on network function, the networks tended to decouple high-energy molecules, as fuels, from the functional module, achieving higher overall dissipation during the course of evolution. Furthermore, we find that decoupled fuel modules can increase the robustness of the networks to parameter or structure perturbations. We provide theoretical analysis of the kinetic proofreading networks and of the general case of energy-driven networks. We find that fuel decoupling can guarantee higher dissipation and, in most cases when considering dissipative networks, higher performance. We conclude that fuel decoupling is an evolutionary outcome and confers benefits during evolution.
q-bio.MN 2024-09-11 2 theorems

Network partitions yield simpler equations for equilibria

by Murad Banaji, Elisenda Feliu

Positive equilibria in mass action networks: geometry and bounds

Alternative systems from partitions correspond one-to-one with positive equilibria and expose geometric bounds and multistationarity regions

Any mass action network gives rise to a parameterised family of polynomial equations whose positive solutions are the positive equilibria of the network. Here, we consider alternative systems of equations, whose solutions are in smooth, one-to-one correspondence with positive equilibria of the network, and capture degeneracy or nondegeneracy of the corresponding equilibria. The construction leads us to consider partitions of networks in a natural sense, and we explore the implications of choosing different partitions. The alternative systems are in some situations simpler than the original mass action equations, which allows us to rapidly identify various algebraic and geometric properties of the positive equilibrium set. This includes the characterisation of toricity and local toricity, bounds on the number of positive nondegenerate equilibria on stoichiometric classes, semialgebraic descriptions of the parameter regions for multistationarity, and the study of bifurcations. After discussing the construction of the alternative systems, various consequences for particular classes of networks and numerous examples are presented. We also develop additional techniques specifically for quadratic networks, the most common class of networks in applications, and use these techniques to derive strengthened results for quadratic networks.
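To make the first sentence of the abstract concrete, here is a minimal sketch of a mass action network whose positive equilibria are the positive roots of a polynomial. It assumes a one-species Schlögl-type network (A + 2X ⇌ 3X, B ⇌ X, with A and B held constant), so that dx/dt = k_in - k_out·x + k_auto·x² - k_rev·x³. The network, the parameter names, and the numerical values are illustrative assumptions, and the code does not implement the partition-based alternative systems described in the paper.

```python
import numpy as np


def positive_equilibria(k_in, k_out, k_auto, k_rev, tol=1e-9):
    """Positive real roots of k_in - k_out*x + k_auto*x**2 - k_rev*x**3."""
    # np.roots expects coefficients from the highest degree to the lowest.
    roots = np.roots([-k_rev, k_auto, -k_out, k_in])
    real = roots[np.abs(roots.imag) < tol].real
    return np.sort(real[real > tol])


def is_stable(x, k_out, k_auto, k_rev):
    """A one-dimensional equilibrium is stable when d(dx/dt)/dx < 0 there."""
    slope = -3.0 * k_rev * x**2 + 2.0 * k_auto * x - k_out
    return slope < 0


if __name__ == "__main__":
    # Hypothetical rate constants chosen so the cubic factors as
    # -(x - 1)(x - 2)(x - 3), giving three positive equilibria.
    k_in, k_out, k_auto, k_rev = 6.0, 11.0, 6.0, 1.0
    for x in positive_equilibria(k_in, k_out, k_auto, k_rev):
        label = "stable" if is_stable(x, k_out, k_auto, k_rev) else "unstable"
        print(f"x* = {x:.3f} ({label})")
```

For these assumed constants the network is multistationary: all three equilibria are nondegenerate because the derivative is nonzero at each root, and the outer two are stable while the middle one is unstable.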
q-bio.MN 2024-09-10 2 theorems

Linear Lyapunov function ensures regularity in reaction systems

by Chuang Xu

On the Regularity of Reaction Systems

Checkable condition proves second-order endotactic and bimolecular weakly reversible systems are non-explosive in both stochastic and mean-f

Reaction networks have been widely used as generic models in diverse areas of applied sciences, such as biology, chemistry, ecology, epidemiology, and computer science. A reaction network incorporating noisy effects is modeled as a continuous time Markov chain (CTMC) and is called a stochastic reaction system. In contrast, the mean field limit of a sequence of volume-scaled stochastic reaction systems as the volume tends to infinity is modeled as an ordinary differential equation (ODE) and is called a deterministic reaction system. Non-explosivity of CTMCs and global existence of solutions of ODEs capture the regularity of the respective dynamical processes. In this paper, we study the regularity of reaction systems, in both stochastic and deterministic senses. By constructing a simple linear Lyapunov function, we obtain regularity in both senses for a class of reaction systems in terms of a simple checkable condition. As an application, we prove that (i) every second-order endotactic mass-action system is regular, and hence (ii) every bimolecular weakly reversible mass-action system is regular. We apply our results to diverse models in biochemistry, epidemiology, ecology, synthetic biology, and natural computing in the literature.
q-bio.CB 2024-08-27 2 theorems

Stiffness triggers hierarchical actin phase transitions

by Yuika Ueda, Shinji Deguchi

Hierarchical phase transitions as mechanical checkpoints of intracellular organization

A thermodynamic model shows energy-entropy thresholds act as checkpoints for cytoskeletal order during cell spreading.

Living cells inherently reorganize their intracellular structures in response to mechanical cues from their environment. Among these responses, the formation of actin-based stress fibers exhibits a series of structural transitions depending on substrate stiffness: from disordered states on soft substrates, to partial alignment, and eventually to bundled formations as stiffness increases. While these transformations have been well documented in many cell types, the physical principles underlying their emergence remain elusive. Here, we observe identical stiffness-dependent actin reorganizations in senescent fibroblasts despite their diminished biochemical and metabolic activities, suggesting that physical constraints play a dominant role in the phenomenon. We then develop a statistical-mechanical framework to demonstrate that these changes arise through a hierarchy of threshold-dependent phase transitions dictated by energy-entropy competition. This formulation provides a thermodynamic basis for understanding how distinct cytoskeletal orders become favored under different mechanical regimes. We propose that these transitions serve as mechanical checkpoints that coordinate intracellular organization during G1-phase spreading. These findings reveal how mechanical cues guide distinct intracellular orders through a physically constrained hierarchy of transitions.
q-bio.MN 2024-06-06 2 theorems

Chemical neuron networks approximate any dynamics

by Alexander Dack, Benjamin Qureshi +2 more

Recurrent neural chemical reaction networks that approximate arbitrary dynamics

Modular architecture lets researchers train reaction networks to produce oscillations, chaos, and other target behaviors.

Many important phenomena in biochemistry and biology exploit dynamical features such as multi-stability, oscillations, and chaos. Construction of novel chemical systems with such rich dynamics is a challenging problem central to the fields of synthetic biology and molecular nanotechnology. In this paper, we address this problem by putting forward a molecular version of a recurrent artificial neural network, which we call a recurrent neural chemical reaction network (RNCRN). The RNCRN uses a modular architecture - a network of chemical neurons - to approximate arbitrary dynamics. We first prove that with sufficiently many chemical neurons and suitably fast reactions, the RNCRN can be systematically trained to achieve any dynamics. RNCRNs with a relatively small number of chemical neurons and a moderate range of reaction rates are then trained to display a variety of biologically important dynamical features. We also demonstrate that such RNCRNs are experimentally implementable with DNA-strand-displacement technologies.
