SoK: Large language model copyright auditing via fingerprinting. arXiv preprint arXiv:2508.19843
5 Pith papers cite this work.
Citing papers

- FLIPS: Instance-Fingerprinting for LLMs via Pseudo-random Sequences
  FLIPS identifies LLM instances with 96% closed-set and 90% open-set accuracy by exploiting biases in generated binary random sequences across 237 instances.
- Misrouter: Exploiting Routing Mechanisms for Input-Only Attacks on Mixture-of-Experts LLMs
  Misrouter enables input-only attacks on MoE LLMs by optimizing queries on open-source surrogates to route toward weakly aligned experts and transferring them to public APIs.
- SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From
  SeedPrints fingerprints LLMs using persistent biases from initialization seeds for lineage verification across pretraining and adaptation stages.
- Black-Box Inference of LLM Architectural Properties with Restrictive API Access
  NightVision recovers LLM hidden dimension to 23% average relative error (9% on MoE) and depth/parameter count to 53% on models >3B parameters using common-set prompting, spectral analysis, and TTFT under single-logit black-box access.
- Echoes within the Reasoning: Stealthy and Effective Watermarking via Chain of Thought
  BiCoT embeds watermarks into the internal geometry of Chain-of-Thought reasoning traces in LLMs via private signature subspace alignment and introduces Robust Subspace Registration for black-box verification under attacks.