
arxiv: 2607.02442 · v1 · pith:Y5H4DW7Mnew · submitted 2026-07-02 · 💻 cs.SE · cs.CR

HTTP REST API Structure Learning

Pith reviewed 2026-07-03 08:26 UTC · model grok-4.3

classification 💻 cs.SE cs.CR
keywords API security · anomaly detection · REST API · unsupervised learning · network traffic analysis · HTTP security · malicious request detection

The pith

HRAL learns REST API endpoint structures directly from network traffic to detect anomalies without documentation or rules.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces HRAL as an unsupervised anomaly detection method that builds models of normal REST API behavior and structure solely by observing network traffic. It does not depend on API documentation or hand-crafted rules to identify deviations that may signal attacks. This matters in practice because many deployed APIs lack complete or up-to-date documentation, which limits the reach of security tools that require such inputs. Evaluation across different levels of documentation completeness shows HRAL reaching an average recall of 82.07 percent and an F1-score of 87.24 percent while outperforming prior techniques in low-documentation settings. When paired with existing signature rules such as OWASP ModSecurity CRS, detection reaches 100 percent.

Core claim

HRAL is a novel unsupervised anomaly detection approach that models the structure and behavior of API endpoints directly from network traffic, without relying on predefined rules or documentation, enabling robust detection of malicious activity by understanding how APIs behave and flagging deviations as potential threats. It achieves an average recall of 82.07% and an F1-score of 87.24%, significantly outperforming alternatives when API documentation is limited, and reaches 100% detection when combined with OWASP ModSecurity CRS.

What carries the argument

HRAL, the unsupervised model that extracts and baselines endpoint structure and behavior patterns from raw network traffic for deviation-based detection.

If this is right

  • HRAL maintains high recall and F1 scores even when OpenAPI documentation is sparse or absent.
  • Performance approaches that of systems given complete API definitions.
  • Pairing the learned model with signature rules such as OWASP ModSecurity CRS yields complete detection coverage.
  • The method supports security in real-world environments where APIs are only partially documented.
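
The pairing of a learned model with signature rules can be pictured as a logical OR over two detectors. The Python sketch below is illustrative only: the two regular expressions are toy stand-ins and are not taken from the OWASP ModSecurity CRS, the allowlist-based anomaly check is a placeholder for a learned model, and the helpers `signature_detector` and `any_of` are hypothetical names.

```python
import re
from typing import Callable, Iterable

Detector = Callable[[str], bool]


def signature_detector(patterns: Iterable[str]) -> Detector:
    """Flag a request line if any (toy) signature pattern matches it."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return lambda request: any(rx.search(request) for rx in compiled)


def any_of(*detectors: Detector) -> Detector:
    """Combine detectors so a request is flagged if at least one flags it."""
    return lambda request: any(d(request) for d in detectors)


# Placeholder for a learned model: anything outside the observed endpoints is anomalous.
known_endpoints = {"GET /users", "GET /orders"}
anomaly = lambda request: request.split("?")[0] not in known_endpoints

# Toy signatures, NOT real CRS rules.
signatures = signature_detector([r"union\s+select", r"\$\{jndi:"])

combined = any_of(anomaly, signatures)

print(combined("GET /users"))                                # False
print(combined("GET /users?q=1 UNION SELECT password"))      # True (signature)
print(combined("GET /internal/debug"))                       # True (anomaly)
```

An OR-combination can only raise recall relative to either detector alone, but it also takes the union of their false positives, so precision has to be checked separately.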

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Layered defenses could treat traffic-derived models as a default baseline that requires no manual documentation updates.
  • The same traffic-driven learning pattern could be tested on non-REST protocols or internal microservice meshes.
  • If traffic patterns prove stable across versions, the approach might reduce the documentation burden on API maintainers.

Load-bearing premise

Network traffic alone contains sufficient and representative information to model normal API endpoint structure and behavior without predefined rules or documentation.

What would settle it

A controlled test on a production API where traffic logs are captured, a set of undocumented malicious requests is injected, and HRAL's detection rate falls well below the reported 82 percent recall.
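
Scoring such a test comes down to the standard definitions of precision, recall, and F1. The following Python sketch shows the arithmetic under the assumption that each request carries a ground-truth label (malicious or benign) and a detector verdict (flagged or not). The function name `detection_metrics` and the sample data are hypothetical and do not come from the paper.

```python
def detection_metrics(labels, flagged):
    """Compute precision, recall and F1 from ground truth and detector verdicts.

    labels[i]  is truthy when request i is actually malicious.
    flagged[i] is truthy when the detector flagged request i.
    """
    if len(labels) != len(flagged):
        raise ValueError("labels and flagged must have the same length")

    tp = sum(1 for y, f in zip(labels, flagged) if y and f)      # caught attacks
    fp = sum(1 for y, f in zip(labels, flagged) if not y and f)  # false alarms
    fn = sum(1 for y, f in zip(labels, flagged) if y and not f)  # missed attacks

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up example: four injected attacks, two benign requests.
labels = [1, 1, 1, 1, 0, 0]
flagged = [1, 1, 1, 0, 1, 0]
print(detection_metrics(labels, flagged))
# {'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

Recall is the quantity the test targets: it is the share of injected malicious requests that the detector flags, independent of how many benign requests it also flags.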

Figures

Figures reproduced from arXiv: 2607.02442 by Amit Dvir, Ran Dubin.

Figure 1: OpenAPI Request-Based Algorithm endpoint representation. By detecting abnormal patterns in API requests, we can identify changes to API endpoints, new APIs, or new attacks. In our evaluation, we have the following abnormal detection behaviors: Minimal/Basic/Full OpenAPI (Supervised) spec: The ground truth evaluation method does not utilize API understanding; instead, it relies solely on a comparison with t…
Figure 2: OpenAPI Request-based and Spec-building Algorithm

Original abstract

Application Programming Interfaces (APIs) are essential in software development, enabling web services, mobile apps, and microservices. However, their widespread use introduces significant security risks, highlighting the importance of API security. This paper presents HTTP REST API Learning (HRAL), a novel unsupervised anomaly detection approach that models the structure and behavior of API endpoints directly from network traffic, without relying on predefined rules or documentation. HRAL enables robust detection of malicious activity by understanding how APIs behave and flagging deviations as potential threats. We evaluate HRAL across varying levels of OpenAPI documentation detail and compare it with existing techniques. HRAL achieves strong performance, with an average recall of 82.07% and an F1-score of 87.24%, significantly outperforming alternatives when API documentation is limited. Moreover, our results approach the effectiveness of full API document definitions. When combined with signature-based rules such as the OWASP ModSecurity CRS, our system achieves 100% detection. These results highlight HRAL's effectiveness in real-world, partially documented API environments and its potential as a foundational layer for modern API security solutions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper presents HTTP REST API Learning (HRAL), a novel unsupervised anomaly detection approach that models the structure and behavior of REST API endpoints directly from network traffic without predefined rules or documentation. It evaluates the method across varying levels of OpenAPI documentation detail, claiming an average recall of 82.07% and F1-score of 87.24% that significantly outperforms alternatives when documentation is limited, approaches the effectiveness of full API definitions, and reaches 100% detection when combined with OWASP ModSecurity CRS.

Significance. If substantiated by rigorous evaluation, the work would represent a meaningful advance in API security by demonstrating a documentation-independent traffic-based modeling technique that can serve as a foundational layer alongside signature-based systems in partially documented environments.

major comments (1)
  1. [Abstract] The central performance claims (average recall of 82.07% and F1-score of 87.24%) are stated without any description of the evaluation methodology, datasets, experimental protocol, baselines, or error analysis. This omission renders the quantitative results impossible to assess or reproduce.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed review and constructive comment on the abstract. We agree that additional context is needed there and will revise accordingly.

  1. Referee: [Abstract] The central performance claims (average recall of 82.07% and F1-score of 87.24%) are stated without any description of the evaluation methodology, datasets, experimental protocol, baselines, or error analysis. This omission renders the quantitative results impossible to assess or reproduce.

    Authors: We agree that the abstract would benefit from a concise description of the evaluation setup. In the revised version we will add a brief clause noting the use of real-world HTTP traffic traces, the protocol of testing across partial-to-full OpenAPI documentation levels, the comparison against signature-based and other unsupervised baselines, and that detailed methodology, results, and error analysis appear in Sections 4–5. This keeps the abstract within length limits while making the claims assessable.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The abstract and manuscript description present an unsupervised modeling approach from network traffic with reported empirical performance metrics (recall, F1-score) from comparisons to alternatives. No equations, derivations, self-citations, or fitted parameters are described that reduce any claim to its own inputs by construction. The evaluation relies on external benchmarks rather than internal definitions or self-referential predictions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5712 in / 992 out tokens · 20841 ms · 2026-07-03T08:26:54.467593+00:00 · methodology

