cs.FL — Pith

0

cs.FL 2026-07-03

Constructible words characterize rational languages on scattered orderings

by Thomas Braipson, Tom Clara

Constructible Words Characterize Rational Languages of Words Indexed by Scattered Linear Orderings

Finite notations suffice to decide membership for all words indexed by scattered orderings.

abstract click to expand

Automata on linear orderings are finite-state automata introduced by Bruy\`ere and Carton as a broad generalization of finite, infinite and transfinite-word automata. In this context, a word is defined as a function from a linear ordering to a finite alphabet. This general definition can make automata on linear orderings difficult to reason about. In this work, we introduce constructible words as an intuitive way of tackling this difficulty. These words can be obtained by a finite number of applications of simple operators and thus admit a finite notation. We show that a rational language of words indexed by scattered (countable and uncountable) linear orderings is characterized by its constructible words. Our proof of this result relies on an interesting theorem of semigroup theory due to Colcombet. We expect this property to be useful in future theoretical developments about automata on scattered linear orderings.

0

cs.FL 2026-07-03

Algebraic kernel polynomials give exact runtime asymptotics

by Michele Boreale, Luisa Collodi +1 more

On the algebraic analysis of runtime distribution of probabilistic programs

For GCP programs the generating function Δ(z) is algebraic via kernel roots, so dominant singularities yield precise expansions and bounds f

abstract click to expand

We present an algebraic method for analyzing probabilistic programs with counters and discrete states, Generalized Constant Probability (GCP) programs. We define the operational semantics of GCP in terms of the runs of a type of probabilistic pushdown automata (pPDAs). We characterize the resulting (sub-)probability generating function (pgf) $\Delta(z)$ as an algebraic function, representable via the roots of a kernel polynomial associated with the program. Next, we provide algorithms that, leveraging this information, compute under mild algebraic conditions the dominant singularities and the exact radius of convergence of $\Delta(z)$, leading to an exact asymptotic expansion and to exponential bounds for its coefficients. Our approach is sound for GCP programs and complete for the single-state subclass.

0

cs.FL 2026-07-03

Overlap-free morphisms are k-power-free for all k at least 3

by Francis Wlazinski (UPJV)

An overlap-free morphism is a k-power-free morphism for any integer k ge 3

The overlap-free condition on morphisms forces avoidance of cubes and every higher repetition as well.

abstract click to expand

We show that any overlap-free morphism is a $k$-power-free morphism for all integers $k\geq 3$.

0

cs.DS 2026-07-03

Algorithm matches unordered tree patterns in O(N max{nD^{3/2},S}) time

by Shintaro Matsushita, Takayoshi Shoudai +1 more

Efficient Pattern Matching in Unordered Term Tree Patterns with Height Constraints

Height-constrained variables let the method handle trees where child order does not matter, such as syntax trees and chemical structures.

abstract click to expand

Unordered trees appear in applications where the order among child vertices is insignificant, such as abstract syntax trees and chemical structures. To describe patterns in such trees, we propose unordered term tree patterns, which employ height-constrained variables that restrict trunk length and subtree height. We formalize the pattern matching problem between an unordered term tree pattern and an unordered tree, and present an $O(N \cdot \max\{nD^{3/2}, \mathcal{S}\})$-time algorithm, where $n$ and $N$ are the numbers of vertices in the pattern and tree, $D$ is the maximum vertex degree, and $\mathcal{S}$ is the sum of trunk constraints. Computational results show that the algorithm runs efficiently in practice.

0

cs.FL 2026-07-02

NFA acceptance reduces randomly to XNFA under polynomial ambiguity

by Dmitry Chistikov, Rados{l}aw Piórkowski +2 more

Algorithms and fine-grained complexity for nondeterministic and symmetric difference automata

Bounded ambiguity yields faster algorithms for acceptance, emptiness and equivalence in both models.

abstract click to expand

Symmetric difference automata (XNFA) are a variant of standard finite automata in which an input word is accepted iff the number of accepting runs is odd. Equivalently, these are weighted automata over the two-element field. We study the fine-grained complexity of the basic decision problems for XNFA: acceptance, emptiness, and equivalence, aiming to optimise the degree of the polynomial in their running-time bounds. Under the assumption of polynomial ambiguity, we provide a randomised reduction of NFA acceptance to XNFA acceptance. For automata of bounded ambiguity (e.g., unambiguous automata), we show that acceptance for both NFA and XNFA can be decided faster than in the general case. Without ambiguity assumptions, we give faster algorithms for the verification of suitable certificates for (non)emptiness and (non)equivalence of XNFA. Several of our results extend to weighted automata over other semirings and fields.

0

quant-ph 2026-07-02

Quantum automata need linear memory where classical need exponential

by Shiroman Prakash

Robust Quantum Memory Advantage from Contextuality

Promise problems on exclusivity graphs force classical automata to χ(G) states but allow quantum ones to use ξ(G)+1 dimensions with constant

abstract click to expand

Quantum contextuality is widely recognized as an essential non-classical resource underlying quantum technology, yet illuminating the precise mechanisms through which it translates into unconditional computational advantages remains an ongoing challenge. We demonstrate an exponential, noise-resilient memory advantage for quantum finite automata arising from graph-theoretic approaches to contextuality. We define a promise problem on an exclusivity graph $G$ for which any classical deterministic automaton acts as a non-contextual hidden variable model requiring at least $N=\chi(G)$ states, where $\chi(G)$ is the graph's chromatic number. In contrast, by exploiting a structural phenomenon we term \textit{representational contextuality}, a QFA solves this task using a memory of dimension at most $d=\xi(G)+1$, where $\xi(G)$ is the graph's orthogonal rank. This separation scales exponentially ($d=\mathcal O(n)$ versus $N=2^{\Omega(n)}$) for Boolean-orthogonality graphs. Crucially, this memory advantage maintains an $\mathcal{O}(1)$ threshold against both depolarizing and coherent noise.

0

cs.FL 2026-07-01

Unary three-way 2D automata make universality coNP-hard

by Taylor J. Smith

Complexity of Universality and Related Decision Problems for Unary Two-Dimensional Automata

Two-way deterministic versions place the same problems in P while nondeterministic versions remain in ELEMENTARY.

abstract click to expand

A two-dimensional automaton is able to move its input head through its input word in four directions: upward, downward, leftward, and rightward. If we prevent the input head from moving upward, then we obtain a three-way two-dimensional automaton; preventing both upward and leftward movements results in a two-way two-dimensional automaton. While much is known about the decidability and complexity properties of the two-dimensional automaton model, the unary variant of this model is less studied. We show that the universality, equivalence, and inclusion problems for unary three-way deterministic two-dimensional automata are coNP-hard, while for the corresponding two-way model, the universality, equivalence, inclusion, and disjointness problems are in P. We further show that the universality, equivalence, and inclusion problems for unary two-way nondeterministic two-dimensional automata are coNP-hard and in ELEMENTARY; and the disjointness problem for the same model is NL-hard and in ELEMENTARY. Finally, we establish the decidability of a bounded variant of the universality problem for unary three-way nondeterministic two-dimensional automata, and show that this variant problem is coNP-complete.

0

cs.FL 2026-06-30

DLSL phase expansions equal fiber-linear graph-respecting transducers

by Reda Belaiche

Destination-Labeled Self-Looping Systems with Dwell: Intrinsic Characterization, Realization Cost, and Recognition

Visible transduction fixes dwells and decisions; realizations need exactly sum of dwell values in states.

abstract click to expand

We study a finite-state symbolic controller for systems in which the admissible visible transitions are fixed in advance and each visible state carries a minimum dwell requirement. The resulting model, which we call a destination-labeled self-looping system with dwell (DLSL system), records the visible graph together with local decision maps; dwell memory appears only after phase expansion. The main structural issue is that, once dwell is imposed, the current visible state no longer determines whether a departure is allowed. This leads to the converse problem: which deterministic transducers arise as phase-expanded realizations of DLSL systems over a fixed visible graph? We show that the answer is exactly the class of fiber-linear graph-respecting transducers. Under natural reachability and realizable-departure assumptions, equivalent accessible realizations over the same visible graph are isomorphic; in particular, the visible transduction determines the dwell vector and the local decision maps. We also prove that any graph-preserving deterministic realization enforcing dwell values $(d_i)$ requires exactly $\sum_i d_i$ control states. Finally, we give an $O(|Q||\Omega|)$ recognition and reconstruction procedure, and extend the analysis to an edge-entry variant in which transitions may enter interior phases of successor fibers.

0

cs.RO 2026-06-30

Control inputs for linear systems are synthesized to meet Time Window Temporal Logic…

by Philip Smith, Ahmad Ahmad +1 more

Robustness-Based Synthesis for Time Window Temporal Logic Specifications via Mixed-Integer Linear Programming

Encoding robust satisfaction as mixed-integer constraints yields solutions that guarantee Boolean satisfaction when the objective is positiv

abstract click to expand

Time Window Temporal Logic (TWTL) is a rich specification language for cyber-physical systems that can compactly express sequential tasks with explicit timing constraints. In this paper, we consider the problem of synthesizing control inputs for discrete-time linear systems subject to TWTL task specifications. Building on the quantitative semantics (robustness) recently introduced for TWTL in [1], we encode the robust satisfaction of a TWTL formula as a set of Mixed-Integer Linear constraints and pose synthesis as a Mixed Integer Linear Program (MILP) that maximizes the robustness degree. We prove that any feasible solution with positive objective value guarantees Boolean satisfaction of the specification. We address two synthesis settings: an \emph{open-loop} formulation that optimizes the full control sequence from the initial state, and a \emph{closed-loop} receding-horizon Model Predictive Controller (MPC) formulation that re-solves the MILP at each step using the current measured state. A key feature of our MPC formulation is a \emph{task-adaptive horizon} that exploits the TWTL Deterministic Finite Automaton (DFA) to determine the active sub-task at each step, limiting the prediction horizon to the remaining window of the current task rather than the full formula horizon, this makes each re-solve significantly cheaper than the initial open-loop solve.

0

math.DS 2026-06-30

Unimodular Pisot numerations yield torus groups

by Olivier Carton, Jake Sudbery +1 more

From some Pisot numerations to topological groups

Z_U, the p-adic-style group for zero-preserving systems, is continuously isomorphic to a torus precisely when the system is unimodular.

abstract click to expand

A Pisot numeration system $U$ for $\mathbb N$ is a sequence of natural numbers generated by an integral homogeneous linear recurrence whose characteristic polynomial is the minimal polynomial of a Pisot number. The purpose of this paper is to introduce the analogue of the group of $p$-adic integers for such numerations when they \emph{preserve zeros}, which is equivalent to the `Condition F' introduced by Frougny and Solomyak for $\beta$-numerations. We show that these topological groups $\mathbb Z_U$ project homomorphically onto a torus. Equipping $\mathbb Z_U$ with the appropriate topology, we also show that if $U$ is unimodular, then $\mathbb Z_U$ is continuously isomorphic to a torus.

0

cs.LO 2026-06-30

LTL ∩ PCTL is decidable via DBW equivalence

by Massimo Benerecetti, Dario Della Monica +3 more

Deciding the Common Fragment of CTL with Past and LTL

An LTL formula lies in PCTL exactly when it is recognized by a deterministic Büchi word automaton, making the common fragment decidable.

abstract click to expand

A central goal of language theory is to compare formalisms by understanding their relative expressive power. One challenging question in this direction is the problem of determining the \emph{common fragment} of two formalisms $F_1$ and $F_2$, that is, effectively characterise the class $F_1\cap F_2$ of properties that can be expressed in both formalisms. A question closely related to this is the \emph{membership problem}, denoted $F_1 \membership F_2$, which asks whether a property expressed in $F_1$ can be also expressed in $F_2$. These problems become particularly difficult when \emph{branching-time} formalisms are involved. In this work, we prove that $\LTL \cap \PCTL$ is decidable, where \PCTL denotes \CTL extended with \emph{past operators}. We do this by showing that both membership problems, $\LTL \membership \PCTL$ and $\PCTL \membership \LTL$, are decidable. The direction $\PCTL \membership \LTL$ follows from suitable combinations of known results. The converse direction, $\LTL \membership \PCTL$, requires an automata-theoretic characterisation of $\PCTL$. Specifically, we introduce a new class of automata, called \emph{counter-free hesitant weak tree automata} ($\HWTcf$) that capture precisely the expressiveness of $\PCTL$, and that are obtained by combining two orthogonal restrictions on alternating parity tree automata, namely, \emph{counter-free hesitancy} and \emph{weakness}. We prove that, for every word language $L$ defined by an \LTL formula, the associated tree language $\triangle[L]$ is recognisable by an \HWTcf if and only if $L$ is recognized by a \DBW. Since the latter recognisability problem is decidable, so is the former. This result advances the longstanding open problem of deciding $\LTL \cap \CTL$. Indeed, that problem can now be reduced to $\PCTL \membership \CTL$, that is, the question of when past operators can be eliminated.

0

cs.FL 2026-06-30

CVASS reachability AC1 in 1D

by Michal Ajdarów, A. R. Balasubramanian +1 more

Reachability in Fixed-Dimensional Continuous VASS

All eight variants show the split, with hardness even on acyclic automata.

abstract click to expand

Vector Addition System with States (VASS) are a ubiquitous model of infinite-state systems consisting of a set of non-negative counters which can be incremented and decremented. It is known that the reachability problem for VASS is Ackermann-complete. Because of this huge complexity, various over-approximations of VASS have been studied in the literature. One such over-approximation is continuous VASS (CVASS), in which the counters are (non-negative) rational numbers and whenever a vector is added to the current counter values, it is first scaled with an arbitrarily chosen rational factor between zero and one. It is known that the reachability problem for CVASS is $\mathsf{NP}$-complete. In this paper, we initiate the study of fixed-dimensional CVASS, i.e., CVASS with a fixed number of counters. We study both the reachability and coverability problems, under both unary and binary encodings as well as over both the non-negative and the rational semantics. This gives rise to a collection of eight different problems. As our main result, we prove a complexity dichotomy for all of these eight problems when the transition vectors are over the rationals: For dimension 1, all of the eight problems are in $\mathsf{AC}^1$, whereas for any dimension at least 2, all of the eight problems are $\mathsf{NP}$-complete. Furthermore, the hardness holds even when the underlying automaton is acyclic. To achieve this result, we present a new technique called the Egyptian prime fractions technique. Finally, we also study these problems when the transition vectors are over the integers. Except for dimension 2, we classify the complexity of these problems over the non-negative semantics: For dimension 1, all of the problems are in $\mathsf{AC}^1$, whereas for dimensions 3 and above, all of the problems are $\mathsf{NP}$-complete.

0

cs.FL 2026-06-30

Transducers keep recurrence and morphic traits of infinite words

by Valérie Berthé, Herman Goulet-Ouellet +3 more

Preservation Theorems for Transducer Outputs

Deterministic finite-state machines map words with these combinatorial properties to outputs that retain them.

abstract click to expand

Suppose we have a deterministic finite-state transducer $A$ and an infinite word $x$, and run $A$ on $x$ to obtain an infinite word $A(x)$. Which properties of $x$ are guaranteed to also hold for $A(x)$? In this paper, we study this preservation question for various well-known combinatorial properties, e.g., recurrence, being morphic, and having factor frequencies. The celebrated Krohn-Rhodes theorem provides the framework for proving our preservation results, and our techniques are based on the ergodic theory of symbolic dynamical systems, i.e., shift spaces.

0

cs.LO 2026-06-30

Kleene theorem extends to many-sorted algebras

by Lü Gong, Raúl Ruiz Mora +2 more

A Kleene theorem for free many-sorted algebras

A language of one sort is recognizable precisely when regular, given finitary assumptions on the free algebra.

abstract click to expand

In this work, we generalize Kleene's theorem from free single-sorted algebras to free many-sorted algebras. Our main result establishes that, under appropriate finitary assumptions, a language of a given sort in a free many-sorted algebra is recognizable if and only if it is regular.

0

math.PR 2026-06-29

Rational function in three variables has no finite-automata factory

by Renato Paes Leme, Jon Schneider

Note on Finite-Automata Bernoulli Factories for Rational Functions

Mossel-Peres equivalence fails for multiple variables; a concrete counterexample admits a general factory but not a finite one.

abstract click to expand

Mossel and Peres (2005) established a comprehensive framework for designing Bernoulli factories. Notably, they demonstrated that a single-variable function admits a finite-automata Bernoulli factory if and only if it is a rational function. Their Theorem 2.9 claims an extension of this result to multivariable functions, but it contains a subtle technical oversight in the application of P\'olya's Theorem. We provide a direct counterexample: a rational function in three variables that admits a general Bernoulli factory but cannot be implemented by a finite-automata Bernoulli factory.

0

cs.FL 2026-06-29

Wheeler transducers admit Myhill-Nerode characterization

by Giovanna D'Agostino, Andrea Paradiso

Finite-State Transducers in the Wheeler Setting

Monotonic outputs plus Wheeler ordering yield closure under composition and an exact syntactic test for realizable functions.

abstract click to expand

Finite-state transducers and Wheeler automata are two well-established frameworks in formal language theory. While transducers extend finite-state automata by associating output words to input words, Wheeler automata are automata whose underlying graph admits a co-lexicographic sorting of states, giving rise to the class of Wheeler languages, a proper subclass of star-free regular languages with efficient indexing properties. In this work, we introduce the notion of sequential Wheeler transducers, a class of deterministic one-way transducers combining the Wheeler condition on the underlying automaton with a monotonicity requirement on the output function. We establish several fundamental properties of this class: closure under composition, and closure of Wheeler languages under inverse image of Wheeler transductions. We then develop a minimization theory by refining Choffrut's syntactic equivalence $\sim_f$ into a relation $\sim_f^c$, and prove a Myhill-Nerode-style theorem characterizing exactly the functions realizable by a sequential Wheeler transducer. Finally, we give a machine-independent characterization of Wheeler functions in terms of the behavior of the function. These results lay the groundwork for a broader structural theory of Wheeler transducers, and we outline open problems concerning decidability, complexity, non-deterministic extensions, and logical characterizations.

0

cs.LO 2026-06-29

Round-based algorithms reduce to finite-counter LTL checking

by Nathalie Bertrand, Pranav Ghorpade +1 more

Parameterized Verification of Asynchronous Round-Based Distributed Algorithms via Reduction to Finite-Counter Systems

Sound complete reduction lets existing model checkers verify any number of processes with unbounded rounds.

abstract click to expand

Traditional model-checking techniques typically verify distributed algorithms only for a fixed number of finite-state processes. Parameterized model checking generalizes this to any number of processes, while still typically assuming that each process is finite-state. In this work, we consider asynchronous round-based distributed algorithms in which each process is infinite-state since it can execute for an infinite number of rounds. We show that the parameterized verification problem for asynchronous round-based distributed algorithms is undecidable, already for simple specifications. Nevertheless, as our main contribution, we provide a reduction to LTL model checking over finite-counter systems and prove that it is sound and complete. This enables the use of off-the-shelf, mature symbolic model checkers for finite-counter systems. We demonstrate the practical applicability of this reduction by verifying safety and liveness properties of several asynchronous round-based consensus and leader-election algorithms using the nuXmv model checker.

0

cs.LO 2026-06-26

On the Continuity of the Probabilistic Bisimilarity Distance

by Syyeda Zainab Fatmi, Stefan Kiefer +2 more

States keep small distance under any small probability change exactly when they are robustly bisimilar, with a fast algorithm to check it.

abstract click to expand

The probabilistic bisimilarity distance provides a quantitative measure of behavioural difference for labelled Markov chains, but it may be discontinuous under perturbations of the transition probabilities. This lack of continuity undermines its applicability to empirically derived models, where transition probabilities are often approximations. Recently, we (CAV 2025) introduced robust probabilistic bisimilarity as a sufficient condition for continuity at distance zero. In this paper, we show that it is also a necessary condition, that is, two states are robustly probabilistic bisimilar if and only if their probabilistic bisimilarity distance is small for any small enough perturbation of the transition probabilities. We further extend robustness to non-bisimilar state pairs to establish a complete characterization for continuity of the probabilistic bisimilarity distance. Based on this characterization, we develop a polynomial time algorithm to decide continuity. Finally, we complement our theoretical contributions with an experimental evaluation demonstrating the proposed approach in practice. Our results show that the extra step of deciding continuity requires minimal additional cost when compared to computing the probabilistic bisimilarity distance.

0

cs.LO 2026-06-26

Fixed phases make tree safety checking EXPSPACE-complete

by Romain Delpy, Anca Muscholl +1 more

On Parameterized Verification Over Tree Topologies

Alternation bounds between upward and downward synchronizations decide safety on arbitrary-depth trees.

abstract click to expand

Parameterized verification of finite-state processes with rendez-vous synchronization is notoriously undecidable when processes are linearly ordered. In this paper we study two kinds of bounds under which we determine the complexity of safety checking over tree topologies. When bounding the depth we obtain that the complexity is related to the fast growing hierarchy. Our second bound limits the alternations between upwards and downwards synchronizations in the tree (phases), and occurs naturally in many concrete settings. If we fix the number of phases then the complexity of safety checking is EXPSPACE complete, and if the number of phases is part of the input it is 2EXPSPACE complete (both for arbitrary depth).

0

cs.LO 2026-06-26

Forward-only construction yields semilinear invariants for VAS

by Clotilde Bizière, Jérôme Leroux +1 more

A Forward-Only Construction of Semilinear Inductive Invariants for VAS

Builds from source alone without backward steps to create structure-aligned invariants, including periodic ones for periodic systems.

abstract click to expand

The reachability problem for Vector Addition Systems (VAS) is a central decision problem in the theory of infinite-state systems, first solved by Kosaraju and Mayr in the 1980s. An alternative, conceptually simpler approach introduced by Leroux shows that non-reachability is always witnessed by semilinear inductive invariants, yielding a decision procedure by combining an enumeration of runs with a search for such invariants. However, the construction of these invariants relies on a back-and-forth scheme that depends symmetrically on the source and the target. As a result, the invariants are not guaranteed to reflect the structural properties of the VAS, and the construction is difficult to extend to asymmetric models such as Branching VAS. We introduce a new forward-only construction of semilinear inductive invariants for VAS. Our method builds invariants from the source configuration alone and avoids the need for backward reasoning. This yields invariants that are more canonical and better aligned with the structure of the system. In particular, our method produces periodic inductive invariants for periodic VAS. Beyond its intrinsic interest, our approach provides a step toward extending invariant-based techniques to Branching VAS.

0

cs.FL 2026-06-26

Emerson-Lei elevator automata get improved complementation

by Ondrej Alexaj, Vojtěch Havlena +3 more

Complementing Emerson-Lei Elevator Automata (Technical Report)

Generalizing the elevator structure from Buchi cases produces an algorithm with better asymptotic complexity than general Emerson-Lei method

abstract click to expand

B\"uchi elevator automata naturally appear in several areas of formal methods as a structural expressibly-equivalent subclass of B\"uchi automata where every strongly connected component is either deterministic or inherently weak. It was shown that this class contains the majority of B\"uchi automata generated in practical applications, including LTL model-checking and verification of hyperproperties. Moreover, the elevator subclass enables more efficient complementation and determinization algorithms than unrestricted B\"uchi automata. In this paper, we introduce Emerson-Lei elevator automata, which is a generalization of B\"uchi elevator automata to richer acceptance conditions. We provide a complementation algorithm with a significantly better asymptotic complexity than the best known algorithm for unrestricted Emerson-Lei automata. The practical efficiency of our algorithm is demonstrated by an experimental comparison with the popular state-of-the-art tool Spot. Our work is, to the best of our knowledge, the first step towards practical algorithms for complementing, determinizing, and testing universality and inclusion of Emerson-Lei automata with rich acceptance conditions.

0

cs.FL 2026-06-26

Observer order forms diamond lattice of languages with P=NP collapse

by Fabio F.G. Buono

Observers, Symmetries, and the Hierarchy of Language Classes: A Theory of Computation Parameterized by the Observer

The profile observer makes deterministic and nondeterministic polynomial classes coincide inside standard P, separating structural blindness

abstract click to expand

We introduce the \emph{observational hierarchy}, a new axis of classification for formal languages, orthogonal to the Chomsky hierarchy. An observer is a function $O : \Sigma^* \to S$ that determines which information about the input is accessible to a computational system. The order-blind automaton, which perceives the input as a multiset of symbols rather than a sequence, constitutes the paradigmatic case. We prove that the class of languages recognisable by any machine equipped with such an observer coincides exactly with the permutation-closed languages. We then define a partial order on observers that induces a hierarchy of language classes parametrised not by the computational power of the machine, but by the structure of the observer. We prove that this hierarchy has the structure of a partial order with a diamond-shaped profile sub-lattice, comprising the length branch $O_\bot \prec O_{\mathrm{len}} \prec O_{\mathrm{prof}} \prec O_\top$ and the parity branch $O_\bot \prec O_{\mathrm{par}} \prec O_{\mathrm{prof}} \prec O_\top$, with $O_{\mathrm{len}}$ and $O_{\mathrm{par}}$ incomparable, and an infinite subsequence branch $O_\bot \prec O_1 \prec O_2 \prec \cdots \prec O_\top$, both converging to the complete observer. We prove that the observational hierarchy is strictly incomparable with the Chomsky hierarchy, and introduce the notion of \emph{observational complexity} of a language. We further define observer-parametrised complexity classes $\mathbf{P}_O$ and $\mathbf{NP}_O$, and show that computational hardness and structural blindness are two independent phenomena. In particular, $\mathbf{P}_{O_{\mathrm{prof}}} = \mathbf{NP}_{O_{\mathrm{prof}}}$ holds as a structural collapse strictly inside $\mathbf{P}$.

0

cs.FL 2026-06-26

Size and ceiling bounds reshape nonuniform automata power

by Tomoyuki Yamakami (University of Fukui)

How Can Size and Ceiling Bounds Affect the Complexity of Nonuniform Automata Families?

Varying state counts and input-length limits changes what languages the automata families capture relative to advised space classes.

abstract click to expand

In the past literature, families of two-way finite automata and pushdown automata having limited state complexity (i.e., the total number of inner states) and stack-state complexity (i.e., the total number of inner states multiplied by the total number of strings "pushable" to a stack), have been studied in direct connection to (mainstream) space-bounded complexity classes equipped with Karp-Lipton style advice of limited size when all inputs given to the automata have bounded length. Here, we acknowledge two major factors -- size and ceiling -- of such families, which have a significant impact on the complexity of finite and pushdown automata families, where the "size" refers to (stack-)state complexity and the "ceiling" refers to an input's length bound. In this line of study, we further explore those effects caused by different sizes and ceilings.

0

cs.FL 2026-06-26

Non-forgetting PCFA match multi-head automata except one case

by Jana Schulz (University of Potsdam)

Parallel Communicating Finite Automata: The Non-Forgetting Model

Only deterministic centralized non-returning mode has strict inclusion, completing prior proofs.

abstract click to expand

Parallel Communicating Finite Automata (PCFA) are systems of several finite automata that can communicate by requesting the state of another automaton. As an attempt to make PCFA as defined in [8] and [2] more realistic, the non-forgetting model is introduced where automata retain their own states while querying. As in previous publications, several variants of these automata systems are considered. The computational capacity of the non-forgetting model is investigated and compared to ''forgetting'' systems and multi-head finite automata. It is shown that most variants of non-forgetting PCFA are as powerful as multi-head automata in the deterministic and nondeterministic cases where the number of automata equals the number of heads. The only exception is the special case of deterministic centralized systems in non-returning mode. Here, a strict inclusion is proved. In the course of this proving, a proof from [2] is completed. With that, this paper answers some questions for nfPCFA that are open for the classical ''forgetting'' case.

0

cs.FL 2026-06-26

Idefix-free selectors add new families to external contextual grammar hierarchies

by Marvin Ködding (PH Heidelberg), Bianca Truthe (Universität Giessen)

Idefix-Free Languages and Their Application in External Contextual Grammars

Prefix-, suffix- and infix-free selection languages are compared to other subregular classes and produce fresh entries in the known hierarch

abstract click to expand

In this paper, we continue the research on the power of contextual grammars with selection languages from subfamilies of the family of regular languages. We investigate infix-, prefix-, and suffix-free languages (referred to as idefix-free languages) and compare such language families to some other subregular families of languages (finite, monoidal, nilpotent, combinational, (symmetric) definite, ordered, non-counting, power-separating, commutative, circular, union-free, star, and comet languages). Further, we compare the families of the hierarchies obtained for external contextual grammars with the language families defined by these new types for the selection. In this way, we extend the existing hierarchies by new language families.

0

cs.FL 2026-06-26

Two-head automata place picture languages between RFA and RPDA

by Henning Fernau (Universität Trier, Germany) +6 more

2-Head 2D Returning Finite Automata

The new model is incomparable with context-free matrix grammars yet properly contained in returning pushdown automata, with a synchronized v

abstract click to expand

We introduce and study a family of two-head finite automata called two head returning finite automata (2-HRFA) operating on rectangular arrays of picture languages, in which both heads move in opposite directions. We show that the class of picture languages accepted by 2-HRFA is incomparable with the class of languages generated by context-free matrix grammars (CFMG), while it forms a proper subset of the class of languages accepted by returning pushdown automata (RPDA). In addition, we define a constrained variant, both head stepping two head returning finite automata (B2-HRFA), in which both heads are required to move in a synchronized, stepwise fashion. We prove that the class of languages accepted by returning finite automata (RFA) is a proper subset of the class of languages accepted by B2-HRFA, which in turn is a proper subset of the class of languages accepted by 2-HRFA. Closure properties for both the families of languages accepted by 2-HRFA and B2-HRFA are also investigated.

0

cs.FL 2026-06-26

Sparse counters cut replication in c-regex matching

by Martin Berglund (Ume{aa} University), Brink van der Merwe (Stellenbosch University) +1 more

Efficient Regex Matching with Sparse Counting-Sets

By keeping only essential counter values, the method reduces copying on branches while preserving exact match results.

abstract click to expand

Regular expressions with counting operations (c-regexes) offer a compact representation of repeating patterns by allowing numerical bounds to be added to subexpressions. Recent work introduced the counting-set data structure, which allows simultaneous updates of multiple counter values for efficient matching. However, this approach suffers from a performance bottleneck when counting-sets must be replicated due to the presence of branching transitions. We propose a sparse counting-set approach, which reduces the replication overhead by maintaining only essential counter values, thereby yielding a more efficient matching algorithm.

0

cs.FL 2026-06-26

Minimum feedback vertex sets turn regex backtracking linear

by Martin Berglund (Ume{aa} University), Brink van der Merwe (Stellenbosch University) +1 more

Selective Memoization for Efficient Backtracking Regular Expression Matching

Memoizing only the smallest cycle-breaking node set removes exponential paths while using far less memory than full memoization.

abstract click to expand

Backtracking regular expression matchers are widely used due to their expressive power but may exhibit exponential worst-case matching time. Memoization provides a principled method for eliminating redundant computation and ensuring linear matching time, but full memoization is memory-intensive and impractical. We introduce the Minimum Feedback Node (MFN) memoization scheme, a selective memoization strategy based on computing a minimum feedback vertex set of an automaton. We establish relationships with existing memoization schemes and analyze their behaviour under both Thompson and Glushkov automaton constructions.

0

cs.FL 2026-06-26

Factorial bound limits Langton's ant in squares

by Kossi Roland Etse

How Long Can the Escaping Ant Be Confined?

Linear upper bounds proven tight for height two and three rectangles via column analysis.

abstract click to expand

Langton's ant is a simple two-dimensional cellular automaton whose long-term behavior exhibits remarkable complexity. While it is known that the ant eventually escapes any finite connected region of the grid, the quantitative aspects of this escape remain poorly understood. In this paper, we study the escaping time of Langton's ant, defined as the maximum number of steps the ant can perform within a finite connected domain before leaving it. We establish general upper bounds on the escaping time as a function of the domain size, and derive improved bounds for rectangular domains. In particular, we obtain a factorial upper bound for square domains via an inductive decomposition argument. We also obtain linear upper bounds for rectangular domains of height two and three via a column-by-column analysis. More generally, for rectangular domains with a fixed height, we establish a polynomial upper bound in the number of columns. These results are complemented by exact values computed through an optimized simulation algorithm that exploits the geometric symmetries of the grid and employs a backtracking branching strategy to avoid exhaustive search over all color configurations. We also provide lower-bound constructions, proving that the linear upper bounds for rectangular domains of heights two and three are asymptotically optimal.

0

cs.FL 2026-06-26

Order-2 bygone-state opacity decided in double exponential time

by Kuize Zhang

Order-2 bygone-state opacity of labeled finite-state automata

Concurrent composition with the classical observer decides whether one agent can be sure about another's state knowledge.

abstract click to expand

In this paper, we formulate a scenario that an agent can never be sure that another agent can uniquely determine the state of a finite-state automaton based on its observations to the automaton at the current and any past time as the property of order-2 bygone-state opacity. Based on our concurrent composition and the classical observer, we derive a tool to verify this property in doubly exponential time. The interest of this result lies in that we extend inference of finite automata from a single agent to two ordered agents.

0

cs.FL 2026-06-25

Unary tropical automata determinisation is coNP-complete

by Shaull Almagor, Ismaël Jecker +4 more

Representing One Letter Weighted Automata Over the Tropical Semiring

Quadratic conversion to unions of deterministic automata proves hardness for determinisation and register minimisation.

abstract click to expand

We consider weighted automata over the tropical semiring $\mathbb{Z}_\infty(min, +)$. Recently, it was shown that determinisation is decidable; in this paper we focus on the complexity when the alphabet is unary. In 2001, Lombardy showed this problem is decidable, a close inspection of his proof yields a coNP upper bound on the complexity. Earlier Gaubert showed that every weighted automaton in this setting can be effectively turned into an equivalent union of deterministic weighted automata. We prove Gaubert's result efficiently, presenting it as a generalisation of Chrobak's normal form for unary NFA. In particular, we prove that the equivalent union of deterministic weighted automata can be represented by a weighted automaton of quadratic size in the size of the original one, and this representation can be computed in polynomial time. Building on this, we show that determinisation, and even register minimisation (which generalises determinisation), is coNP-complete. We complete the paper with observations that the boundedness problem is also coNP-complete by reductions with determinisation. Lastly, we provide evidence that all of these problems are not FPT (by proving $coW_1$-hardness) when parametrised by the number of deterministic automata in the union.

0

cs.FL 2026-06-25

Discounted sums cannot be exactly monitored with finite memory

by Filip Cano, Thomas A. Henzinger +2 more

Monitoring Discounted Sum Properties

Runtime verification must use ε-approximate monitors whose memory and observation needs are bounded explicitly in terms of the allowed error

abstract click to expand

Runtime monitoring of quantitative signals faces a fundamental trade-off between volatility and over-aggregation: instantaneous observations are noisy, while long-run averages obscure local structure. Localisation measures such as discounted averages offer a principled middle ground, yet remain poorly understood in runtime verification. This paper studies discounted sums from a monitoring perspective, in both deterministic and stochastic settings. We formalize the discounted monitoring problem and show that exact, sound monitoring of discounted sums cannot be achieved with finite memory. To overcome this impossibility, we introduce $\varepsilon$-approximately sound monitoring, deriving explicit bounds on memory and observation requirements. We then extend the framework to stochastic processes via expected discounted sums, defining pointwise and uniform $(\varepsilon,\delta)$-soundness notions, establishing statistical optimality, and proving impossibility beyond a precision threshold. We also formalize the resource complexity of deterministic discounted monitoring via affine register machines and prove a tight worst-case lower bound. Finally, we present a specification language for arithmetic expressions over multiple discounted sums with synchronous and asynchronous semantics, and evaluate our approach on practical scenarios including algorithmic fairness.

0

cs.FL 2026-06-25

NCMA 2026 workshop held in Rouen on June 29-30

by Cezar Câmpeanu, Martin Kutrib +1 more

Proceedings of the 16th International Workshop on Non-Classical Models of Automata and Applications

The sixteenth edition of the series on non-classical automata models took place in France under University of Rouen organization.

abstract click to expand

The Sixteenth International Workshop on Non-Classical Models of Automata and Applications (NCMA 2026) was held in Rouen, France, on June 29 and 30, 2026, organized by the University of Rouen. The NCMA workshop series was established in 2009 as an annual event for researchers working on non-classical and classical models of automata, grammars or related devices. Such models are investigated both as theoretical models and as formal models for applications from different points of view.

0

cs.LO 2026-06-25

Unifilarisation arises as coalgebraic determinisation

by Manuel Baltieri, Nathaniel Virgo

Bayesian updates from coalgebraic determinisation

Bayesian update transitions and causal output distributions emerge from the general construction on Mealy machines over monads with support-

abstract click to expand

The powerset construction is the classical determinisation procedure for nondeterministic finite automata. In the coalgebraic setting, this construction has been generalised to structured coalgebras, which are coalgebras equipped with extra data. For stochastic Moore machines over the distribution monad, a type of structured coalgebra, the determinisation construction induces a semantics assigning to each finite input word a distribution on the current output. This semantics is appropriate when only the current output matters, but it is too coarse for settings in which intermediate observations must also be taken into account, as is typical for agents solving POMDPs in control theory and reinforcement learning. In these contexts, agents need to condition on all realised observations, not just the final one, so to better plan for the future. This has been addressed from a category theoretic perspective through a procedure called ``unifilarisation'', which (in our context) takes a stochastic Mealy machine and produces a machine whose states are priors over the original state space and whose transitions are given by Bayesian filtering. Here we show that unifilarisation is an instance of coalgebraic determinisation. We work with Mealy machines over monads equipped with extra structure generalising the notion of the support of a distribution. We show that in this setting, unifilarisation arises from the general determinisation procedure. We then compare the resulting final coalgebra semantics with the Moore-style one. Instead of assigning only a distribution on current outputs to each finite input word, it yields causal stochastic behaviours, that is, families mapping input words to distributions on output words compatible with the ``causality'' constraint that outputs cannot depend on future inputs.

0

cs.SE 2026-06-25

Formal grammars reach every BPM phase via seven separate streams

by Milliam Maxime Zekeng Ndadji

Formal Grammars in Business Process Management: A Systematic Literature Review

Review of 34 studies maps parallel applications from design to verification and lists five open challenges for integration.

abstract click to expand

Business Process Management (BPM) is concerned with the systematic design, execution, monitoring, and improvement of business processes. Formal grammars have emerged as a particularly fruitful formalism for BPM, offering generative, declarative, and analytical capabilities that are uniquely well-suited to process-oriented concerns. This paper presents a systematic literature review of 34 primary studies at the intersection of formal grammars and BPM. We identify seven research streams: (i) process grammars for organizational process design; (ii) process modeling languages evaluated as grammars under the Bunge-Wand-Weber ontological framework; (iii) production-rule grammars for process structural specification and variant management; (iv) attribute grammars for the declarative specification and distributed execution of workflows; (v) graph grammars for the transformation, generation, and semantic analysis of process models; (vi) grammatical inference for process mining and discovery; and (vii) process algebras as grammar-like compositional frameworks for behavioral specification and verification. For each stream, we synthesize contributions, formalisms employed, and limitations. The review reveals that formal grammars have influenced BPM across every lifecycle phase (from organizational design to formal verification and data-driven discovery) yet the seven streams have developed largely in parallel, without cross-stream synthesis. We identify five corpus-grounded open challenges and argue that a deeper, unified exploitation of grammatical theory holds significant promise for advancing the state of the art in BPM.

0

cs.FL 2026-06-24

Regular languages gain O(log n) edit-stable local annotations

by Faruk Alpay, Baris Basaran

Exact Local Annotations for Regular Languages

Balanced product construction keeps locality constant and membership access constant-time after single-symbol changes.

abstract click to expand

A regular language is recognized by a finite monoid, but a locally checkable explanation of that recognition can have a nontrivial update geometry. We study exact bounded-arity annotations for regular word languages under one-symbol substitutions. The cost of an edit is the number of annotation cells that a canonical locally accepted representation must change, together with the corresponding bit movement and the number of local constraints that must be revalidated. For every morphism recognizing a regular language, the balanced product annotation gives constant locality, linear size, O(log n) edit stability, O(log n) revalidation, and constant access to the membership value. The matching lower bound proved here is restricted to product decompositions that expose an edit-active nontrivial group quotient as ordered product labels; in that setting one substitution changes every quotient label on an ancestor path. We also show that annotation-free bounded-window recognition is exactly strict locality, prove closure properties for a two-sided total decision variant, and formulate the remaining constant-stability boundary as a finite obstruction problem. The ancillary files include Lean, CP-SAT, and CUDA certificates, including a context-free interval-chart experiment.

0

cs.LO 2026-06-24

Linear algebra checks MJLS moment stability for chosen initial sets

by Lena Becker, Holger Hermanns

Stability Checking of Markov Jump Linear Systems via Probabilistic Temporal Logic (Extended Version)

PCTL extensions reduce these properties to matrix computations, avoiding a full decision procedure for the base logic on Markov jump systems

abstract click to expand

Markov jump linear systems (MJLSs) model dynamical phenomena subject to random switching among multiple linear modes, driven by an underlying Markov chain. Classical notions such as mean and mean-square stability characterize the long-term asymptotic behaviour of the first and second moments of an MJLS, but they can be overly conservative or even misleading when only a specific subset of initial conditions is of interest. We tackle this challenge through the lens of model checking, where reasoning about specific sets of initial conditions is intrinsic to the approach. We begin by formalizing probabilistic computation tree logic (PCTL) on MJLSs, enabling the specification of state-based temporal properties for these systems. Building on this foundation, we extend the logic to capture moment-based stability properties relative to a prescribed set of initial states. While we ultimately do not obtain a decision procedure for the entire base logic, the logical extensions can be handled -- albeit with some technical subtleties -- by exploiting linear-algebraic techniques.

0

cs.CC 2026-06-24

Matching bounds fix token cost to certify stochastic-oracle reliability

by Jie Wang

Token Complexity of Certifying Stochastic-Oracle Reliability

Upper and lower bounds agree in the small-error limit, giving the exact leading-order certification cost.

abstract click to expand

Wang~\cite{Wang2026} introduced the Stochastic-Oracle Turing Machine (SOTM) framework and defined token complexity as the minimum expected cost of interacting with a stochastic oracle needed to attain a specified solution quality for a task. This paper develops an analogous notion for certifying the reliability of a stochastic oracle on a given domain. Certification token complexity is the minimum expected token cost required, with controlled error probability, to distinguish oracles that meet a target reliability level from those that fall below a lower reliability threshold. We construct an SPRT-based certification SOTM that queries the oracle, computes binary correctness scores, and stops when the accumulated log-likelihood evidence crosses a decision threshold. The SOTM halts almost surely, satisfies the desired two-sided error guarantee over the reliability regions to be certified, and yields an explicit upper bound on certification token complexity in terms of the reliability thresholds, the error bound, and the expected per-turn token cost. We then establish a matching information-theoretic lower bound: even with adaptive queries, every error-bounded certification SOTM must incur the same leading-order expected token cost as the SPRT-based construction as the prescribed error bound tends to zero. Together, these bounds characterize the leading-order certification token complexity in the small-error regime.

0

cs.SE 2026-06-22

Pipeline turns LLM Java into formally verified code

by Ran Wei, Le Zhu +5 more

Formal-Method-Guided Vibe Coding: Closing the Verification Loop on AI-Generated Safety-Critical Software Through Model-Driven Engineering

Verification failures from Dafny, CSP and Z feed back as prompts until evidence for certification standards appears

abstract click to expand

Vibe coding -- accepting LLM-generated source from natural-language intent with minimal review -- is fast and may be adequate for low-criticality consumer software. But for safety-critical systems governed by DO-178C, IEC 61508, or ISO 26262, it offers no path to certification: large language models (LLMs) provide no formal correctness guarantees, and existing remedies target verification-aware languages (Dafny, Verus, Lean) that are scarce in pretraining data and absent from industrial toolchains. This paper closes the gap. We present Forge (Formal method Oriented Refinement loop for GEnerated code): a closed-loop pipeline that guides vibe coding through formal verification using established Model-Driven Engineering (MDE) infrastructure. Through vibe coding, we generate Java source code; our pipeline then extracts -- via model transformations -- formal artefacts in three different formalisms, each checked by a complementary verifier: deductive verification (Dafny), Communicating Sequential Processes (CSP) refinement via the Failures-Divergences Refinement checker (FDR4), and theorem proving using Z-Machines in Isabelle; every verification failure becomes a structured correction prompt that drives the next code-generation iteration. The LLM is the draft generator, the MDE chain is the discriminator, and the developer never has to read the formal models. Empirically, we find that the pipeline produces standards-relevant verification evidence for LLM-generated Java -- a step toward certification.

0

cs.FL 2026-06-22

Polynomial algorithm computes DTMC probabilities for FDFA specs

by Christel Baier, Sascha Klüppelholz +1 more

Probabilistic Model Checking via Families of Deterministic and Unambiguous Finite Automata

Extends to unambiguous families and yields single-exponential LTL translation versus double-exponential for deterministic versions.

abstract click to expand

Families of deterministic finite automata (FDFA) have been introduced as a concise automaton model that characterizes $\omega$-regular languages by processing their ultimately periodic words. FDFA are known to enjoy many good properties and can be exponentially more succinct than deterministic $\omega$-automata with Rabin, Streett or parity acceptance. This paper addresses two main questions: (1) Are FDFA suitable for probabilistic model checking purposes? and (2) Is it possible to obtain an even more compact representation of $\omega$-regular languages by allowing the components of an FDFA to be unambiguous instead of deterministic? Question (1) is answered in the affirmative by presenting the first polynomial-time algorithm for computing the probability that a discrete-time Markov chain satisfies an $\omega$-regular property represented as an FDFA. Question (2) is motivated by the fact that unambiguous finite automata may require exponentially fewer states than deterministic ones. This paper introduces a model of families of unambiguous finite automata (FUFA) that captures the class of $\omega$-regular languages. FUFA can be exponentially more succinct than both FDFA and unambiguous B\"uchi automata, and there is a single-exponential translation from linear temporal logic (LTL) to FUFA. This stands in contrast to a double-exponential lower bound for the translation from LTL to FDFA. Moreover, the polynomial-time probabilistic model checking algorithm for discrete-time Markov chains against FDFA-specifications is extended to the case where the property is represented by an FUFA with a deterministic leading automaton.

0

cs.FL 2026-06-22

Abstractions speed parametric timed game synthesis

by Mikael Bisgaard Dahlsen-Jensen (1), Laure Petrucci (2) +6 more

State-Space Abstractions for Parametric Timed Games

A general framework lifts zone merging, hulls and similar reductions to games while proving they preserve correct parameters and strategies.

abstract click to expand

Synthesizing controllers for real-time systems under both timing uncertainty and adversarial environments requires exploring prohibitively large symbolic state spaces. While zone inclusion checking has been applied to Parametric Timed Games, more aggressive abstractions from the Parametric Timed Automata and Timed Games literature -- double inclusion, zone merging, hull abstractions, and location-based abstraction -- have not yet been lifted to the parametric game setting. We define a general abstraction framework for Parametric Timed Games, instantiate it with each of the aforementioned abstractions, and prove that the framework preserves correctness of parameter synthesis and winning strategies. Experimental results on an established production cell benchmark and a novel adversarial IoT case study show that the abstractions significantly improve scalability, solving instances previously intractable for existing techniques.

0

cs.FL 2026-06-19

Random labels make Moore automata minimal with high probability

by Matías Carrasco, Sergio Yovine

Minimality of Random Moore Automata under Prefix-Dependent Congruences

When label agreement probability is below one and each admits three symbols, the prefix-dependent equivalence collapses to the identity almo

abstract click to expand

We study prefix-dependent congruences for random deterministic transition systems with state outputs. In this setting, the admissible continuations used to compare two states may depend on the observed prefix, and two states are identified only if no common admissible continuation distinguishes their future outputs. The framework includes probabilistic deterministic finite automata as a motivating special case. We analyze the random transition model in which all transition values are independent and uniform. Each state is also assigned an independent label that specifies both its output and its set of admissible symbols. If two independent labels agree with probability strictly less than one, and every label has at least three admissible symbols, then the induced congruence is trivial with high probability. The proof combines a pruning process on pairs, a collision-free exploration controlling its early evolution, and a first-moment argument showing that the remaining pairs cannot organize into nontrivial equivalence classes.

0

cs.FL 2026-06-19

Alternating real-time automata learned with fewer states

by Kazuki Kinoshita, Masaki Waga

Learning Alternating Real-Time Automata

AL*RTA produces smaller models than nondeterministic learning but uses more queries.

abstract click to expand

We present the AL*RTA algorithm for learning alternating real-time automata (ARTAs) using membership and equivalence queries. AL*RTA combines ideas from AL*for learning alternating finite automata and NL*RTA for learning nondeterministic real-time automata. We first define ARTAs and show that alternation improves succinctness, although it does not increase expressive power. We then present AL*RTA and show its termination. Our empirical evaluation suggests that AL*RTA generally learns smaller automata than NL*RTA at the cost of more queries.

0

cs.DS 2026-06-18

Left-convex languages make gap-constrained subsequence matching linear-time

by Antoine Amarilli, Florin Manea +2 more

Tractable Gap-Constraint Languages for Complex Event Recognition

The property yields an O(|D|(|u| + |C|)) algorithm for arbitrary interval structures, optimal under SETH, while still covering length bounds

abstract click to expand

For strings $u, D \in \Sigma^*$, a subsequence embedding of $u$ in $D$ is a function $e \colon \{1, 2, \ldots, |u|\} \to \{1, 2, \ldots, |D|\}$ with $e(i) < e(i+1)$ for every $i \in \{1, 2, \ldots, |u|-1\}$ and the $i$-th symbol of $u$ equals the $e(i)$-th symbol of $D$. A gap-constraint for $u$ is a triple $(i, j, L)$ with $1 \leq i < j \leq |u|$ and $L$ is a regular language over $\Sigma$. An embedding $e$ satisfies a gap-constraint $(i, j, L)$ if the factor of $D$ strictly between positions $e(i)$ and $e(j)$ is a word from $L$. We investigate the subsequence matching problem with gap-constraints, which is relevant in the context of complex event recognition (CER): given $u, D \in \Sigma^*$ and a set $C$ of gap-constraints, find an embedding of $u$ in $D$ that satisfies all gap-constraints from $C$. In general, subsequence matching is NP-complete and the only known tractable variants restrict the interval structure of the gap-constraints. In this work, we show that we can solve subsequence matching with gap-constraints with an arbitrary interval structure rather efficiently (in fact, optimally under SETH) in time $O(|D| (|u| + |C|))$ if the gap-constraint languages satisfy a property which we dub left-convexity: whenever $u v w \in L$ and $v \in L$, then also $uv \in L$. Left-convex languages are sufficiently expressive to model interesting real-world scenarios considered in CER, e.g., length constraints $L = \{w \mid a \leq |w| \leq b\}$ for $a, b \in \mathbb{N}$. We also show how our algorithm can be used in order to efficiently enumerate all satisfying embeddings, which is particularly relevant for possible applications in CER. Finally, we show how non-left-convex languages can lead to intractability, i.e., if in addition to length constraints we allow $\{aa, \epsilon\}$ as the only non-left-convex constraint language, then the problem is NP-complete again.

0

cs.FL 2026-06-17

Hypergraph maps characterize epimorphisms of universal automata

by Jasem Hamoud

On Epimorphisms of Hypergraphic Automata and Input Symbol Semigroups

Necessary and sufficient conditions on state and output hypergraphs decide when a triple of maps is an epimorphism, with weak and strong not

abstract click to expand

Hypergraphic automata are automata whose state sets and output symbol sets are hypergraphs invariant under the actions of the transition and output functions. Universally attracting objects in the category of such automata are called universal hypergraphic automata; their semigroups of input symbols are algebras of mappings whose properties are tightly linked to the algebraic structure of the automata themselves. This paper establishes a complete characterisation of epimorphisms of universal hypergraphic automata and of their semigroups of input symbols. A central contribution is the introduction of two distinct notions of epimorphism for hypergraphs including weak, strong and the proof that these notions diverge in general but necessarily coincide for the important subclass of $p^*$-hypergraphs, which includes automata whose state hypergraphs and output hypergraphs are projective or affine planes. The main results give necessary and sufficient conditions for a triple $(f, \mathbb{P}_s, g)$ to be an epimorphism of universal hypergraphic automata, expressed in terms of the component maps on the state and output hypergraphs.

0

cs.NE 2026-06-15

Weight discretization cuts SNN verification state space 17-fold per neuron

by Nikan Zandian Jazi, Elisabetta De Maria +1 more

A Formal Tool for Verification of Probabilistic Spiking Neural Networks Based on Quotient Abstractions

Two theorems bound firing error and guarantee long-term silence, making formal checks feasible on networks that were previously intractable.

abstract click to expand

Spiking Neural Networks (SNNs) model biological neural dynamics more faithfully than classical artificial networks, but their stochastic, event-driven computation -- rooted in ion-channel noise and unreliable synaptic vesicle release -- demands probabilistic models for which deterministic abstractions are mathematically inadequate. Formal verification of such models via probabilistic model checking faces a fundamental barrier: the state space explosion problem, where the Discrete-Time Markov Chain (DTMC) encoding grows exponentially with the number of neurons. General-purpose quotient model abstractions [1] can in principle mitigate this growth by partitioning membrane potentials into equivalence classes, but a na\"ive application to SNNs discards synaptic weight information, limiting the properties that can be verified. This paper introduces a weight-discretized quotient model abstraction that maps continuous synaptic weights to a compact integer range while preserving the relative contribution of each synapse, and presents CogSpike, a unified workbench that integrates SNN design, simulation, and PRISM-based formal verification within a single isomorphic tool chain. The discretization is accompanied by formal correctness guarantees: a two-sided fidelity theorem confines any firing disagreement to a bounded gray zone around threshold, and an Asymptotic Silence theorem gives the exact limit guarantee that unforced neurons fall permanently silent. A topology-dependent scaling analysis shows that the state space reduction compounds exponentially -- approximately $17\times$ per neuron for discretization parameter $W = 3$ -- enabling verification of networks that are otherwise intractable, as confirmed empirically across seven canonical topologies.

0

cs.FL 2026-06-11

Online algorithm enforces safety in hybrid automata

by Mir Md Sajid Sarwar, Srinivas Pinisetty +2 more

Runtime Enforcement of Hybrid System Properties

Reachability analysis synthesizes event edits to prevent violations in systems with continuous dynamics, shown on adaptive cruise control.

abstract click to expand

Runtime enforcement has emerged as a promising approach for ensuring the safety of autonomous and cyber-physical systems operating in uncertain and dynamic environments. Unlike traditional runtime verification, runtime enforcement actively intervenes during execution to prevent property violations by modifying unsafe system behaviors. Existing enforcement frameworks primarily focus on untimed or discrete-time specifications and are often limited to delaying or suppressing events, making them inadequate for reactive systems exhibiting complex continuous dynamics. In this paper, we propose a runtime enforcement framework where safety requirements are modeled using Hybrid Automata (HA). The framework combines discrete-event editing with continuous-time monitoring to support enforcement actions such as suppression, delay, and insertion of events at arbitrary time instants. Upon observing environmental inputs, the automaton is initialized, and runtime reachability analysis is used to synthesize safe corrective actions. We formally define the enforcement problem for safety hybrid automata, establish enforceability conditions, and present an online enforcement algorithm for reactive systems. A detailed case study on an Adaptive Cruise Control (ACC) system demonstrates the effectiveness of the proposed approach in maintaining safety properties under unsafe controller behaviors. Experimental results show that the framework introduces minimal computational overhead while ensuring continuous compliance with safety requirements in real time.

0

cs.FL 2026-06-10

Normal timed automata bandwidth now computed from graph ratio

by Eugene Asarin, Aldric Degorre +2 more

Weighing Timed Regular Languages: The Final Step (long version)

Bandwidth equals α log(1/ε) where α is the max reward-to-cost ratio on a finite weighted graph derived from the automaton.

abstract click to expand

The bandwidth of a timed language characterizes the quantity of information per time unit (with a finite observation precision $\varepsilon$). The asymptotic behavior of the bandwidth as $\varepsilon \to 0$ classifies timed regular languages in three classes: meager, normal, and obese. Normal timed automata have a bounded frequency of events and some non-punctual transitions, and, up to now, were the only class of timed automata for which no algorithm was available for computing their bandwidth. In this article, we compute the bandwidth of any such automaton in the form $\approx\alpha\log{1/\varepsilon}$. Our approach reduces this problem to computing the best reward-to-cost ratio in a weighted finite graph constructed from the given timed automaton.

0

cs.FL 2026-06-10

ZPCP belongs to Σ⁰₂ but not to Π⁰₁ or Σ⁰₁

by Olivier Finkel, Vesa Halava

On the Complexity of the Bi-infinite Post Correspondence Problem

Reductions from Turing non-halting locate the bi-infinite PCP at the second level of the arithmetical hierarchy.

abstract click to expand

In the bi-infinite Post Correspondence Problem ($\Z$PCP), it is asked whether the same bi-infinite word can be constructed correspondingly from a given finite set of pairs of words. In this article, we study its complexity with respect to the arithmetical hierarchy and prove that it is in $\Si^0_2 \setminus (\Pi^0_1 \cup \Si^0_1)$ and, therefore, at the level 2 of the arithmetical hierarchy. For the proof, we present a sequence of reductions starting from the nonhalting of the Turing machine all the way to $\Z$PCP via infinite PCP, an $s$-shift infinite PCP and $s$-shift $\Z$PCP for all natural numbers $s$. In the process, we prove that the infinite PCP is undecidable for injective morphisms, and that the infinite injective PCP, $s$-shift infinite PCP, $s$-shift $\Z$PCP and the non-termination problem for (deterministic and reversible) semi-Thue systems are all $\Pi^0_1$-complete.

0

cs.DC 2026-06-10

LCL gaps extend to unbounded degree trees via LPMSO

by Chiara Piombi

Generalizing LCL Complexity Gaps to Unbounded Degree via Monadic Second-Order Properties

The ω(log n)-n^{o(1)} gap and per-k polynomial gaps hold for problems locally verifiable via Presburger monadic second-order formulas.

abstract click to expand

The last decade of research on the LOCAL model has seen tremendous progress in understanding locally checkable labeling (LCL) problems, culminating in an almost complete classification of the possible complexities LCL problems can exhibit. In particular, on undirected trees, Chang and Pettie showed that there is no LCL problem with complexity between $\omega(\log n)$ and $n^{o(1)}$ and Chang showed that, for every positive integer $k$, there is no LCL problem with complexity between $\omega(n^{1/(k+1)})$ and $o(n^{1/k})$; additionally, which side of each gap a problem is found on is decidable. While the class of LCL problems - which, roughly speaking, consists of problems for which the correctness of a solution can be described by a finite set of allowed node configurations, which in turn can be locally verified by a constant-time algorithm - includes many important problems, it has one major restriction: problems can be defined only on bounded degree graphs, which consequently restricts all the classification and gap results mentioned above. In this work, we propose a generalization of LCL problems to unbounded degree using Presburger monadic second-order (PMSO) formulas; more specifically, we consider what we call Local PMSO (LPMSO) problems, i.e., problems whose correct solutions are both finitely described by a PMSO formula and locally verifiable by a LOCAL algorithm in constant time - this class contains many of the important problems studied in the LOCAL model but defines them on unbounded degree graphs. As our main result we prove that, on unbounded degree rooted trees, the aforementioned $\omega(\log n)$ - $n^{o(1)}$ and $\omega(n^{1/(k+1)})$ - $o(n^{1/k})$ complexity gaps (and their decidability) extend to the class of LPMSO problems.

0

cs.CR 2026-06-10

Layer order in pipelines reduces to deterministic edit automata

by Faruk Alpay, Taylan Alpay

Layer Order Semantics for Automata-Based Cybersecurity

The model separates recognition from faithful online enforcement and pins desync failures to specific marker factors under reorder congruenc

abstract click to expand

Layered cybersecurity pipelines transform evidence before they decide on it, and the order of those transformations determines which security facts become visible to which layer. This paper gives layer order a finite-state semantics built from a layer-order automaton, deterministic sequential security transducers, evidence markers, and a final decision automaton. The worked case is HTTP request desynchronization: front-end and back-end processors compute incompatible request boundaries, and the same trace is detected or missed according to whether framing evidence reaches the parser-differential layer before it commits. The results separate completed-trace recognition, online editing, decision synthesis, and faithful enforcement; characterize faithful online enforcement as the regular prefix-closed case under causal visibility; and show that regular policies beyond that boundary remain recognizable without becoming deployable enforcers. The framework is monolithically equivalent to finite-output deterministic edit automata, while preserving layer-local invariants such as marker birth, marker survival, and reorder-sensitive visibility. A concrete parser-pair semantics identifies the forbidden marker factor with CL.TE, TE.CL, TE.TE, and HTTP/2-downgrade boundary disagreement under the stated abstraction, and a contextual reorder congruence classifies which component permutations induce the same decision language. The result is an automata-theoretic account of order-sensitive security failures and a compositional vocabulary for auditing, synthesizing, and comparing layered enforcement pipelines.

0

cs.FL 2026-06-09

Worsening trick yields short formulas for rewrite rule transducers

by Mans Hulden, Michael Ginn

Compiling Rewrite Rules to Finite-State Transducers with the Worsening Trick

It generates and filters candidates to handle contexts and weights while matching prior results.

abstract click to expand

Finite-state transducers (FSTs) are essential for modeling string rewriting in computational linguistics and natural language processing (NLP), particularly for phonological and morphological rewrite rules. Compiling general rewrite rules of the form $A \to B / L \, \_ \, R$, where $A$, $B$, $L$, and $R$ are arbitrary regular languages, is complex due to overlapping matches and context constraints. Traditional methods, such as those by Kaplan and Kay or Karttunen, rely on intricate transducer compositions with auxiliary markers. This paper presents a compact compilation scheme based on the "worsening trick'': generate all legal rewrite candidates, then filter candidates that are worse than another candidate for the same input. Implemented as the built-in rewrite compiler in PyFoma, the construction supports multiple contexts, arbitrary transductions, markup, directed rewriting, weights, and parallel rewriting. The resulting formulas are short and uniform, and where semantics coincide, they reproduce the same rule transducers as earlier approaches while remaining easier to extend. The implementation has been validated against foma on both a substantial collection of rewrite grammars and an automated regression suite covering the major rewrite modalities, with the resulting transducers matching exactly apart from state numbering.

0

cs.CL 2026-06-09

Causal intervention fixes flawed estimates of language task learnability

by Vésteinn Sn{ae}bjarnarson, Anej Svete +4 more

Causally Evaluating the Learnability of Formal Language Tasks

Correlational analysis alone misjudges how much data is needed once hidden frequency confounders are isolated.

abstract click to expand

Language models, as multi-task learners, acquire a wide range of abilities during training. A fundamental question is how much task-specific data is needed to learn a given task. Answering this for natural language is difficult: tasks are hard to delineate and can confound one another. To rigorously investigate the relationship between data frequency and learnability, we turn to a controlled setting using formal languages induced from probabilistic finite automata. These serve as a methodological testbed to demonstrate that standard correlational evaluation practices are inherently flawed. To enable causal analysis, we introduce the binning semiring, an algebraic object that lets us control how often a targeted property occurs in a sampled corpus. We formulate the experimental pipeline as a causal graphical model and derive decomposed Kullback-Leibler divergence metrics to measure the learnability of specific sub-tasks. Our experiments show that evaluating learnability without causal intervention leads to incorrect conclusions due to confounders in correlational analysis, and serve as a warning about correlational pitfalls in natural-language settings.

0

cs.CL 2026-06-09

Infinitely many low-rate hallucinations raise recall in the limit

by Irene Strauss, Alexandra Butoi +1 more

Generating in the Limit with Infinitely Many Hallucinations

Asymptotic validity keeps precision at one and strictly improves coverage when the adversary permanently withholds part of the target langua

abstract click to expand

The classic paradigm of language identification in the limit models learning as a game between an adversary, who reveals strings from an unknown target language, and a learner tasked with identifying that language. The recently introduced framework of language generation in the limit shifted the objective to better reflect modern language modeling, requiring the learner to produce valid, unseen strings from the target language. Related work highlighted a fundamental tension: a broad coverage of the target often comes at the cost of validity. We introduce a new notion of precision and recast this problem as the classic recall-precision trade-off. We analyze generation in the limit under varying constraints on enumeration, novelty, and validity, aimed at reflecting settings closer to those encountered by large language models. A key contribution is our analysis of learners that are not eventually valid: we allow infinitely many mistakes, provided their frequency tends to zero so that precision remains one. We show that this relaxation can strictly increase recall when the adversary permanently withholds a large portion of the target language. We also study a continuous relaxation of the novelty constraint that requires only a fixed fraction of outputs to be novel. Taken together, our results move toward a more realistic model of language generation where occasional errors and repetitions are unavoidable, but their rates are controlled.

0

nlin.CG 2026-06-09

Row-column field yields long transients without local rules

by Xiaorui Shi, Mengsha Huang

Dynamics in a Low-Rank Separable Field Cellular Automaton

A normalized separable coupling organizes cellular automaton behavior into phases with a ridge of extended transients separating two cycle r

abstract click to expand

Complex collective dynamics in cellular automata are usually associated with local-neighborhood combinatorics, yet it remains unclear whether long-lived dynamical organization requires such explicit local interaction structure. Here, we introduce a Separable-Field Cellular Automaton (SFCA), a normalized-field cellular automaton in which local neighbor counting is replaced by a rank-one-like row-column field. Each cell is updated according to a normalized field, with survival and birth governed by two threshold intervals. Systematic scans over interval widths and positions revealed four outcome classes: extinction, fixed points, cycles, and long transients. The outcome phase diagram was organized by the relative geometry of the survival and birth intervals: fixed points dominated when born interval was contained in survival interval, whereas long transients concentrated near the boundary between partial overlap and no overlap. A fine scan along this transition showed that the long-transient region forms a narrow but persistent ridge separating two qualitatively distinct cycle-dominated regimes. One side produced dense, high-change-rate cycles approximating global period-2 alternation, whereas the other produced sparse, low-change-rate, stripe-like cycles. Damage-spreading further supported a basin-competition interpretation, in which the long-transient ridge reflects delayed selection between two cyclic attractor families rather than random nonconvergence, while finite-size analysis shows that the long-transient ridge remains robust across tested grid sizes. These results show that structured long-transient dynamics can arise under compressed separable field coupling, suggesting that nontrivial collective organization does not necessarily require full local-neighborhood combinatorics.

0

cs.FL 2026-06-08

GLR parsers show only 3x slowdown vs LR(1) on deterministic grammars

by Huan Vo, Danushka Liyanage +3 more

An Empirical Comparison of General Context-Free Parsers

Benchmark of six general algorithms on 22 real grammars finds narrow variance and positions GLR as practical default.

abstract click to expand

Parsing underpins a vast range of software engineering tasks, from compilers and static analyzers to language servers and fuzz testing tools. Yet most parsers deployed in practice are deterministic (LL or LR), forcing developers not only to contort their grammars to fit the parser, but to simplify the very languages they design sacrificing expressiveness for the sake of parseability. General context-free parsers eliminate this constraint. Yet, despite decades of algorithmic development, no rigorous head-to-head comparison exists across the major families of parsing algorithms. We present the first unified, controlled benchmark of six generalized parsing algorithms: CYK, Valiant, Earley, GLL, RNGLR, and BRNGLR, plus deterministic LL(1) and LR(1) baselines, all implemented in Rust with shared data structures and parse-tree extraction, and evaluated across 22 grammars ranging from simple expressions to full C++ and Java. Our results show that the cost of generality is lower than widely assumed. On deterministic grammars, the GLR family incurs only a 3x median slowdown over LR(1), with a narrow and predictable variance. GLR is the clear performance winner among generalized parsers and a practical default choice for software engineering tools.

0

cs.AI 2026-06-08

LTLf injection raises constraint satisfaction in RL policies

by Ashkan Ansarifard (1), Matteo Mancanelli (1) +2 more

Neuro-Symbolic Injection of LTLf Constraints in Autoregressive Reinforcement Learning Policies

Differentiable DFA signals improve adherence to safety-reachability specs while returns stay competitive with vanilla baselines.

abstract click to expand

In this work we study offline reinforcement learning (RL) under temporally extended task constraints expressed in Linear Temporal Logic over finite traces (LTLf). Recently, transformer-based approaches such as Trajectory Transformers and Decision Transformers have been adopted to address RL as a sequence modeling problem. However, these methods optimize purely for reward and do not account for high-level temporal requirements. Here, we introduce a neurosymbolic framework that injects LTLf background knowledge into such transformer-based RL policies. Our approach compiles LTLf formulas into deterministic finite automata (DFAs) and integrates them into the learning process through a differentiable representation and a logic-based loss function. In particular, we derive differentiable satisfaction signals from DFA progression and use them as a regularization term during training. The resulting method is architecture-agnostic across different models. We evaluate the proposed framework on navigation environments with specification suites covering combinations of safety and reachability temporal properties. Experimental results show that incorporating background knowledge not only improves constraint satisfaction, but also maintains competitive return compared to vanilla baselines.

0

math.CO 2026-06-08

Non-periodic words satisfy Pal(n) log(Pal(n)) = o(ρ(n))

by Jeffrey Shallit

Palindrome complexity versus factor complexity

The scaled ratio of palindrome complexity to factor complexity tends to zero at infinity for words that are not ultimately periodic.

abstract click to expand

Let ${\bf x} = (a_i)_{i \geq 0}$ be an infinite word over a finite alphabet $\Sigma$. Let $\rho (n)$ be the factor complexity function for $\bf x$ and ${\rm Pal}(n)$ be the palindrome complexity function for $\bf x$. We give a new relationship between these two quantities; namely, if $\bf x$ is not ultimately periodic, then $$ \lim_{n \rightarrow \infty} {{ {\rm Pal} (n) \log ({\rm Pal} (n) + 1)} \over {\rho (n)}} = 0. $$ Furthermore, we prove that the numerator in this result is essentially optimal.

0

cs.DS 2026-06-08

Unary MSO queries on streamed trees answered earliest in constant time

by Mateusz Gienieczko, Martín Muñoz +2 more

Earliest query answering over streamed trees

An iterator for node output lets the algorithm decide returns and discards as soon as the stream guarantees the outcome.

abstract click to expand

Streaming allows executing queries over massive JSON or XML documents whose size makes it infeasible to fully parse them into a tree. Earliest query answering is a radical approach to reducing latency and memory footprint. To minimize latency, a document node must be returned as soon as the node is guaranteed to be an answer regardless of how the document ends. Similarly, to minimize memory footprint, a node must be discarded as soon as it cannot become an answer regardless of how the document ends. For simple queries that select nodes based on the path from the root, the decision for each node can be made on the spot, but practical languages such as XPath or JSONpath support filters, which allow selecting nodes based on information collected from various parts of the document, possibly further down the stream. This makes earliest query answering a challenging task, as candidate nodes must be kept in memory until it becomes clear that they can be safely returned or discarded. We show that this can be done for all unary queries expressible in monadic second order logic (MSO), while ensuring constant update time -- provided that nodes are returned by passing a suitable iterator, rather than one by one.

0

cs.LG 2026-06-08

Projected model tracks non-Abelian states perfectly to 1M tokens

by Jeonghoon Lee

A Held-Out Transition-Pair Falsifier for Long-Horizon Non-Abelian State Tracking

Trained on length-8 sequences, it succeeds on held-out transitions where GRUs and SSMs fail across million-token horizons.

abstract click to expand

State tracking exposes a sharp limitation of sequence models: the relevant signal is often not a summary of observed tokens, but an ordered latent state that evolves through non-commutative transformations. We introduce a held-out transition-pair falsifier for finite non-Abelian group tracking. The protocol forbids selected ordered generator pairs during training and requires the same local patterns during evaluation, blocking one direct local-transition memorization pathway. In a controlled $S_3 \times S_3$ benchmark, a projected recurrent state model trained only on length-8 sequences produces error-free final-state predictions (perfect 250/250 per horizon) through evaluation horizons up to 1,048,576 tokens across five seeds. Matched native-readout baselines, including bag, GRU, and a single-configuration structured state-space model, remain near floor under the same protocol. Projection-matched GRU, structured SSM, and bag baselines equipped with analogous finite-group prototype readouts also remain near chance under the same split. Mechanism diagnostics show that hard projection coincides with low homomorphism error, low state-consistency drift, and non-trivial commutator separation, while softened projection collapses final-state accuracy. Clean-split audits verify zero verbatim reduced-word overlap and zero structural-template overlap between training and evaluation partitions. The evidence is scoped to this controlled finite-group falsifier rather than to a general architecture ranking. Within that regime, explicit projected non-commutative state composition acts as a useful inductive bias for long-horizon hidden-state tracking.

0

cs.FL 2026-06-08

2011 diagnosability algorithm fails without deadlock-free assumption

by Kuize Zhang

A remark on diagnosability verification

Counterexample shows the polynomial procedure errs on LFSAs with dead states or divergences; co-diagnosability is PSPACE-hard.

abstract click to expand

We point out three inaccuracies in paper [M.V. Moreira, T.C. Jesus, and J.C. Basilio. Polynomial time verification of decentralized diagnosability of discrete event systems. IEEE Transactions on Automatic Control, 56(7):1679-1684, July 2011]. First, the authors wrongly claimed that their algorithm for verifying (co-)diagnosability of labeled finite-state automata (LFSAs) did not depend on assumptions. We give an LFSA that is not deadlock-free or divergence-free such that their algorithm cannot correctly verify its diagnosability. Because diagnosability is a special case of co-diagnosability, their algorithm cannot correctly verify co-diagnosability either when LFSAs are not deadlock-free or divergence-free. Second, they wrongly claimed that adding at each dead state an unobservable self-loop can help verifying diagnosability for an LFSA that is not deadlock-free or divergence-free, but this is wrong, because such a modification sometimes changes the diagnosability of an LFSA. Third, they wrongly claimed that their algorithm for verifying co-diagnosability ran in polynomial time. A polynomial-time algorithm unlikely exists, because the problem of verifying co-diagnosability of LFSAs is PSPACE-hard.

0

cs.FL 2026-06-05

SAI learns interval-predicate automata from samples with polynomial data

by Erwann Loulergue, Peter Habermehl

Passive Learning of Symbolic Automata over Monotonic Algebras

The algorithm merges and splits states to identify deterministic symbolic automata over monotonic algebras in the limit.

abstract click to expand

Symbolic automata extend classical finite-state automata to handle large or infinite alphabets by labeling transitions by predicates coming from a boolean algebra. Many results from automata theory have been lifted to this model, and it has proved its usefulness for example in multiple software verification applications. Here, we tackle the passive learning problem of identification in the limit, i.e. learning a model from a sample without access to an oracle to query. We provide an algorithm, SAI, that efficiently identifies in the limit symbolic automata over any monotonic algebra where predicates labeling transitions are of the form a <= x < b. The algorithm extends the RPNI framework for passive learning of finite-state automata to symbolic automata thanks to a new splitting operation inspired by RTI, a passive learning algorithm for deterministic real-time automata, a subclass of timed automata. The learning algorithm combines merging of states and splitting of states allowing to infer the predicates on transitions in a top-down fashion. We prove that SAI admits polynomial size characteristic samples.

0

cs.FL 2026-06-05

Event-B gains non-intrusive timed clocks for correct-by-construction design

by Guillaume Dupont, Jun Sun

Correct-by-Construction Design of Timed Systems in Event-B

Embedding timed automata semantics extends refinement to timing, letting complex real-time systems be built incrementally with full tooling.

abstract click to expand

Real-time systems require the careful handling of timing aspects in their models. For critical applications, this entails the use of time-aware formal methods. Currently, such formal methods only account for timing and communication layers, excluding functional aspects. Thus, they are intended to be used as a posteriori analysis methods, on systems that have already been developed. In contrast, methods such as Event-B have been designed to build systems incrementally using a correct-by-construction approach, but are not equipped with the ability to express timing aspects and constraints. We propose a non-intrusive, tool-supported embedding of time and clocks in Event-B inspired by the features and semantics of timed automata. This enables the design of complex real-time systems while benefiting from the entire ecosystem and tooling support of the method. Refinement is extended to also take time into account, making it possible to design complex systems gradually in a correct-by-construction manner while integrating timing aspects from the top level. The embedding and associated methodology are illustrated on a case study, showcasing both how timed Event-B models may be derived from timed automata, how the extended expressivity of first-order logic and set theory at the core of Event-B enables finer modelling, and how timed refinement may be used to establish complex timing properties.

0

math.CO 2026-06-02

Equal s-substrings in cyclic Fibonacci word are fully characterized

by Takashi Horiyama, Yasuhide Numata +2 more

On gapped repeats in a cyclic Fibonacci word

The pairs of starting indices yielding identical segments of length s are described and their quantity is computed exactly.

abstract click to expand

In this article, we consider the words with cyclic indices. For given $s$, we consider the pair $(\iota,\kappa)$ of indices such that the word of length $s$ from $\iota$ is equal to the word of length $s$ from $\kappa$. We give a characterization of such pairs for a cyclic Fibonacci word, and give the number of them.

0

cs.FL 2026-06-02

Arithmetic model decides what languages RNNs recognize

by Franz Nowak, Ryan Cotterell +1 more

An Algebraic View of the Expressivity of Recurrent Language Models

Floating-point blocks even-modulus counters in diagonal models; unsigned integers allow every such counter.

abstract click to expand

What formal languages can a recurrent neural language model recognize? Formal results in the literature conflict: some authors report Turing-completeness, while others show equivalence to regular languages. The reason for this discrepancy is that the underlying arithmetic model differs. The paper develops a unified algebraic account of the expressivity of recurrent neural networks, starting with a formal account of various arithmetic models. This account reduces expressivity to an algebraic question, e.g., whether a network's syntactic monoid divides a certain wreath product. As a case study, the paper revisits diagonal state-space models: the same architecture cannot implement an even-modulus counter once floating-point recurrences are enforced, yet realizes every even-modulus counter under unsigned-integer quantization.

0

cs.PL 2026-06-01

Grid programs compute Turing-completely without variables

by Ezequiel López-Rubio

Grid Programs: A Two-Dimensional, Variable-Free Model of Computation

A 2D grid of instructions navigated by a pointer uses stacks and a linked list to simulate any register machine.

abstract click to expand

We introduce Grid Programs, a novel model of computation in which programs are finite two-dimensional arrangements of instructions on an integer grid rather than linear sequences of statements. Three properties distinguish this model fundamentally from classical frameworks: (i) programs are planar structures through which an instruction pointer moves in the four cardinal directions; (ii) there are no syntax constraints, so any assignment of instructions to grid cells constitutes a valid program; and (iii) the model uses no named variables or explicit memory addresses. Program state is maintained through a data stack, an address stack, and a circularly doubly linked list accessed via three named pointers. Control flow is achieved spatially, with branching encoded as perpendicular turns of the instruction pointer. The address stack stores triplets (cell row, cell column, direction), enabling precise restoration of both position and heading after branches, loops, and function calls. We give a formal operational semantics, present a representative instruction set covering arithmetic, control flow, and linked-list manipulation, and work through several detailed examples, including an absolute-value function, a factorial computation, a linear-search algorithm, a string-reversal program, and a while-loop summation. We establish that Grid Programs are Turing-complete by simulating an arbitrary register machine, and we discuss their relationship to prior two-dimensional languages such as Befunge and Funge-98, to stack-based languages such as Forth and PostScript, and to dataflow and spatial computation models. Grid Programs offer a fresh vantage point for exploring the design space of computation, with potential applications in visual programming environments, cellular-automaton-inspired hardware, and obfuscation-resistant code.

0

cs.FL 2026-06-01

k-bandwidth NFAs form strict language hierarchy

by Da-Jung Cho, Szilárd Zsolt Fazekas +4 more

Bandwidth of Nondeterministic Finite Automata

Bandwidth 2 recognizes every finite language while minimization is NP-hard for fixed k at least 2

abstract click to expand

Co-transcriptional splicing generates RNA sequences from a DNA template by deleting subsequences nondeterministically. Recent work showed how to encode an NFA into such a template, but the construction requires deleting subsequences whose length grows with the distance between states, which makes such deletions unlikely under the local nature of co-transcriptional splicing. We introduce $k$-bandwidth NFAs, in which transitions span at most $k$ states. These automata form a strict hierarchy of language classes. For finite languages, bandwidth $2$ suffices, and bandwidth $1$ can be decided in polynomial-time when the language is presented as a list of words. Minimizing the bandwidth is NP-hard even for fixed $k \geq 2$.

0

cs.LG 2026-05-29

Padded transformers match AC^0 or TC^0 by precision and depth

by Anej Svete, William Merrill +2 more

Revisiting Padded Transformer Expressivity: Which Architectural Choices Matter and Which Don't

Constant precision reaches L-uniform AC^0 and growing precision reaches TC^0, independent of width or attention type.

abstract click to expand

Recent work describes what transformers can and cannot compute through connections to boolean circuits, but existing results lack exact characterizations and are sensitive to modeling choices. Padded transformers -- to whose input filler symbols such as ``...'' are appended -- emerge as a useful gadget for establishing equivalences to circuit classes by providing polynomial space for adaptive parallel computation. However, only a limited set of padded transformer idealizations has been studied, leaving open how robustly these equivalences hold under changes to attention type, model width, and uniformity. We find that, under practical assumptions, padded transformers are surprisingly robust to all of these, and identify numeric precision and model depth as the main factors affecting expressivity. Concretely, we prove that polynomially padded $\text{L-uniform}$ constant-precision transformers are equivalent to $\text{L-uniform AC}^0$, while growing-precision ones achieve $\text{L-uniform TC}^0$ regardless of width. Furthermore, looping enables sequential processing analogous to circuits: $\log^d N$-looped constant-precision transformers reach $\text{FO-uniform AC}^d$, and growing-precision ones reach $\text{FO-uniform TC}^d$. Interestingly, growing width or precision beyond logarithmic does not increase expressivity, and all our results hold for both softmax and average hard attention transformers.

0

cs.FL 2026-05-29

One adaptive head beats any number of oblivious heads

by Julianne Cruz, Sho Glashausser +1 more

One Adaptive Trailing Head Can Outperform Many Oblivious Trailing Heads

On some binary sequences the dimension gap exceeds 0.3, showing adaptivity cannot be replaced by extra fixed heads.

abstract click to expand

In the setting of multi-head finite-state dimensions, trailing heads lag behind a leading head, accessing past data to aid a finite-state gambler placing bets on successive bits read by the leading head. Cruz, Glashausser, Li, and Lutz (2026) proved that, for any fixed number of trailing heads, adaptive (data-dependent) movement rules can strictly outperform oblivious (data-independent) movement schedules. In this paper we strengthen that separation by proving that a single trailing head with adaptive movements can outperform, by a large and uniform margin, arbitrarily many trailing heads with oblivious movements. Formally, our main theorem states that there is a binary sequence whose adaptive two-head finite-state strong dimension is less than its oblivious multi-head finite-state dimension, and that the gap is greater than 0.3.

0

cs.FL 2026-05-29

Pattern inclusion depth computation remains open

by Wei Luo

The Inclusion Depth of Pattern Languages: An Open Problem in Algorithmic Learning Theory

The longest inclusion chain length for any pattern may equal 2 times length minus variables minus one, but this is unproven and the general

abstract click to expand

Pattern languages are a classical model in formal language theory and algorithmic learning theory. This note formulates the problem of computing the inclusion depth of a pattern language: the length of the longest strict inclusion chain from the universal pattern language to the language generated by a given pattern. Inclusion depth captures the mind-change complexity of pattern identification from positive data. The central open question is whether the inclusion depth ID_Sigma(p) is computable for every pattern p over every finite alphabet Sigma with at least two symbols, and whether it is computable in polynomial time. A simple conjectured formula, ID_Sigma(p) = 2|p| - #var(p) - 1, would imply a linear-time algorithm. The problem connects pattern language inclusion, combinatorics on words, language identification in the limit, and mind-change-bounded learning.

0

math.CO 2026-05-28

Ten binary squares force an overlap

by Jeffrey Shallit

Ten Squares Force an Overlap

The smallest number that guarantees an overlap pattern in any concatenation over two letters, while nine is possible and three letters allow

abstract click to expand

We prove that every concatenation of $10$ or more binary squares contains an overlap. The bound $10$ is best possible. In contrast, over a ternary alphabet, there are infinitely long overlap-free words that consist of a concatenation of squares.

0

q-fin.CP 2026-05-28

Automata product yields exact payoff bounds for constrained scenarios

by Vitaly Nürnberg

Scenario Constraints with Memory: A Finite-State Approach to Quantitative Financial Analysis

Synchronized EHAs and WFFAs replace simulation estimates with guaranteed upper and lower payoffs plus explicit witness histories for structu

abstract click to expand

Quantifying worst-case and best-case performance under complex market scenarios is a persistent challenge in financial risk management and the verification of path-dependent financial instruments, such as exotic options and structured products. Simulation-based methods are well suited for probabilistic estimation, but they do not directly provide exhaustive guarantees over all admissible scenarios or explicit witnesses for extremal outcomes. To address this, we introduce a quantitative automata-based framework for the exact extremal analysis of financial systems under declarative scenario constraints. At the core of our approach are event history automata (EHAs), a new formal model that integrates regular-expression event patterns with admissible numerical intervals to represent constrained event histories with memory. Quantitative payoffs are represented by weighted finance finite automata (WFFAs), which allow transition weights to depend on observed market values. By computing the synchronized product of EHAs and WFFAs, our framework enables the exact calculation of upper and lower payoff bounds. Furthermore, the method automatically extracts interpretable witness event histories that realize these extremal outcomes. We demonstrate the practical viability of the approach through a case study of an autocallable structured product with path-dependent mechanisms. The case study analyzes how different scenario constraints affect coupon accumulation, early redemption, and protection-loss outcomes. Scalability experiments indicate that the framework's execution remains computationally feasible for practical contract horizons and nontrivial constraint configurations. Overall, this approach provides a mathematically rigorous complement to standard financial simulation methods.

0

cs.FL 2026-05-27

Cones induce right-stable carriers for quantitative language observations

by Faruk Alpay, Baris Basaran

Cone-Induced Observation Congruences for Vector-Valued Quantitative Languages

Extreme rays of the dual cone classify residuals into sign cells to produce canonical carriers from order cone geometry on the residual span

abstract click to expand

We study the observation congruences induced by rational polyhedral cones on vector-valued quantitative languages. The extreme rays of the dual cone define intrinsic covectors, and these covectors classify every incremental residual future by a finite sign cell: negative, tight, or positive along each extremal Farkas direction. The resulting carrier is the right-stable carrier of this cone-induced observation family, whose source is canonical: the restricted covector geometry of the order cone on the residual span of the language. We organize this construction through an observation-refinement correspondence, a cone-refinement calculus, and a separation between the qualitative conic observation quotient and the numerical residual carrier needed for potential certificates. A bounded-horizon fragment is fully computable by enumeration of accumulated futures, and reproducible evaluation runs show how the conic layer detects qualitative obstruction cells before numerical refinement.

0

cs.LO 2026-05-27

Tree is epsilon-accepted exactly when within bisimulation distance epsilon of a rigid tree

by Anita Moyasari, Harsh Beohar +2 more

Tree Automata Acceptance up to Measurable Defect

The equivalence turns defect-budget verification into a standard acceptance question on a nearby tree.

abstract click to expand

Automata acceptance can, in several situations of interest, be captured game-theoretically via acceptance games. The existence of a winning strategy for Verifier then captures the existence of a winning run-tree of a given automaton over a model. However, such acceptance is rigid, in that it does not allow a measurable defect budget, which can be a challenge in software verification. In this paper, we draw inspiration from how bisimulation distance can be defined as an extension of bisimilarity to define epsilon-acceptance games. Our main theorem shows that a tree T is epsilon-accepted iff there is a tree T' that is accepted in the traditional (rigid) sense and the bisimulation distance of T' and T is at most epsilon. Our work also suggests a strong connection with measure theory, of which we give a preliminary exploration via appropriate examples. Our framework is defined over binary trees with leaves and infinite branches, and strictly contains the case in which binary nodes are seen as probabilistic choice and the defect measures the probability of the set of rejected branches.

0

math.CO 2026-05-27

Every graph is 2-word-π-representable

by Duncan Adamson, Amanita Dietz +3 more

2-word-π-representable Graphs

An algorithm returns two words whose letter projections define the edges of any input graph.

abstract click to expand

This paper investigates the new notion of $2$-word-$\pi$-repre\-sentable graphs: the nodes of the graph correspond to the letters of the two words and there exists an edge between two nodes if the projections of any two letters of both words are equal. The benefit of not only using one word for a representation as introduced by Kitaev and Pyatkin is that every graph is $2$-word-$\pi$-representable. We present an algorithm that returns two representing words for any graph. Aside, we show that every permutation graph is representable by two $1$-uniform words and give constructions how graph operations on $2$-word-$\pi$-representable graphs can be realised on their representing words which give further insights into the representation of cographs.

0

cs.LO 2026-05-27

Almost fair simulations suffice for many liveness proofs

by Arthur Correnson, Iona Kuhn +1 more

Almost Fair Simulations

Controlled fixed-point alternations give intuitive rules for proving fair trace inclusion without full simulation complexity

abstract click to expand

It is well known that liveness properties cannot be proven using standard simulation arguments. This issue has been mitigated by extending standard notions of simulation for transition systems to fairness-preserving simulations for systems equipped with an additional fairness condition modeling liveness assumptions and/or liveness requirements. In the context of automated verification of finite-state systems, proofs by simulation are an appealing method as there exist efficient algorithms to find a simulation between two systems. However, applications of fair simulation to interactive verification have been much less studied. Perhaps one reason is that the definitions of fair simulation relations typically involve non-trivial nestings of inductive and coinductive relations, making them particularly difficult to use and to reason about. In this paper, we argue that in many cases, stronger notions of fair simulation involving more controlled alternations of fixed points are sufficient. Starting from known fair simulation techniques, we progressively build up a family of almost fair simulation relations for transition systems equipped with a Buechi fairness condition. The simulation relations we present can all be equipped with intuitive reasoning rules, leading to elegant deductive systems to prove fair trace inclusion. We mechanized our simulation relations and their associated deductive systems in the Rocq proof assistant, proved their soundness, and we demonstrate their use through a selection of examples.

0

cs.FL 2026-05-25

Pomset languages regular exactly when recognized by finite categories

by Enzo Erlich, Jérémy Ledent +1 more

Algebraic Characterization of FO-definable Languages of Higher-Dimensional Automata

The algebraic view generalizes McNaughton-Papert by tying first-order definability of concurrent languages to aperiodic categories.

abstract click to expand

Higher-dimensional automata (HDA) are a model of concurrency that models simultaneous execution of events using higher dimensional cells. HDA recognize languages of pomsets, a generalization of finite words whose letters are partially ordered. We prove a new algebraic characterization of HDA languages: a language of pomsets is regular if and only if it is the inverse image of a functor from the category of pomsets into a finite category. Furthermore, the language is definable in first-order logic exactly when it is recognized by an aperiodic category, generalizing the McNaughton-Papert theorem to HDA languages. We also investigate a notion of counter-free HDA, and show that if a language is accepted by a counter-free HDA, it must be definable in first-order logic. The converse, however, is still open.

0

cs.FL 2026-05-25 Recognition

Tool generates dependent event sequences using DFAs and transducers

by Alexander Obeid Guzman

AGDES: Automatic Generation of Dependent Event Sequences

Sequences are DFA output words or transducer outputs from input sequences, formalizing dependency generation.

abstract click to expand

This note presents AGDES, a tool for Automatic Generation of Dependent Event Sequences. Each event sequence is either generated as the output word of a deterministic finite automaton (DFA), or produced as the output word of a DFA called transducer that reads events from one or more input sequences, and produces an output sequence.

0

math.CO 2026-05-21 2 theorems

Rado graph has the finite length property

by Jingjie Yang, Miko{l}aj Bojańczyk +1 more

The Finite Length Property of the Rado Graph and Friends

Chains of equivariant subspaces in its free vector spaces are bounded, extending earlier results for sets and orders.

abstract click to expand

An infinite structure has the finite length property (over a given field) if, for each of its finite powers, chains of equivariant subspaces in the corresponding free vector space are bounded in length. Prior work showed that the countable pure set and the countable dense linear order without endpoints have this property. We generalise these results to (a) any structure approximated by finite substructures with few orbits, provided the field is of characteristic zero, and (b) any Fra\"iss\'e limit with free amalgamation in a finite vocabulary consisting of unary and binary relations, possibly expanded with a generic total order. As a special case, we deduce the finite length property of the Rado graph using both methods. We also describe some connections with function spaces, weighted register automata, and orbit-finite systems of linear equations.

0

cs.LO 2026-05-21 Recognition

Tool verifies probabilistic logic in stochastic systems 10-1000x faster

by Paapa Kwesi Quansah, Ernest Bonnah

SENTIL: A Runtime Verification Tool for Probabilistic Temporal Logic

SENTIL supplies statistical guarantees during live operation and integrates directly with ROS, MATLAB, and embedded platforms.

abstract click to expand

Stochastic cyber-physical systems (CPS) permeate critical infrastructure, from autonomous vehicles to medical devices. Yet, tools for runtime verification of such systems capturing the probabilistic dynamics in stochastic systems remain generally absent despite theoretical foundations established nearly a decade ago. In this paper, we present SENTIL, a novel runtime verification tool with provable statistical guarantees for the runtime monitoring of requirements expressed as Probabilistic Signal Temporal Logic (PrSTL). SENTIL combines an efficient Rust core with universal ecosystem integration, delivering performance exceeding existing deterministic monitors while providing rigorous probabilistic guarantees through statistical model checking, sequential probability ratio testing, and adaptive rare event estimation. SENTIL employs streaming algorithms for incremental robustness computation, parallel Monte Carlo sampling, and a language-agnostic C-ABI enabling seamless deployment across ROS, Apollo, MATLAB Simulink, and AUTOSAR platforms, and direct integration in C, C++, Python, and Java. To validate the effectiveness of the proposed tool, we validate SENTIL across various scenarios spanning autonomous vehicle monitoring, medical device validation, and biological networks, demonstrating 10-1,000$\times$ performance improvements over existing tools while maintaining provable confidence intervals. SENTIL is open source (\href{https://github.com/sedislab/SENTIL}{\texttt{sedislab/SENTIL}}) and it positions probabilistic runtime verification as a deployable infrastructure for all real-world safety-critical stochastic systems.

0