Cross-Domain Generalization Failure in Lightweight Intrusion Detection Models for IIoT Networks

MD Azizul Hakim; Md Shihab Uddin; Talha Ibne Anis

arxiv: 2607.00553 · v1 · pith:VLEVLYRHnew · submitted 2026-07-01 · 💻 cs.CR · cs.AI


Pith reviewed 2026-07-02 11:31 UTC · model grok-4.3


keywords: intrusion detection, IIoT, cross-domain generalization, lightweight models, explainability, port features, shortcut learning


The pith

Lightweight IIoT intrusion detectors fail to generalize across networks because they over-rely on source-specific port features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper trains four lightweight machine learning architectures for intrusion detection on a single IIoT dataset and evaluates them without retraining on two structurally different IIoT datasets, using only features present in all three sources. The models exhibit poor cross-domain performance. Explainability analysis of the two top-performing models shows they depend overwhelmingly on coarse port-category features; the most influential category occurs in source-domain attack traffic at 96 to 435 times the rate seen in the target domains. The work also observes that evaluating under realistic class imbalance can reverse which target domain appears more challenging, and that adversarial robustness does not track generalization.

Core claim

Lightweight architectures trained on one IIoT dataset and evaluated without retraining on two structurally distinct IIoT datasets generalize poorly; explainability analysis of the two top-performing models indicates that both rely overwhelmingly on coarse port-category features, the most influential of which occurs in source-domain attack traffic at 96 to 435 times the rate seen in the target domains.
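The rate comparison behind the 96 to 435 figure can be illustrated with a small sketch. The exact computation is not given on this page, so the sketch assumes the ratio is the share of attack flows falling in one port category in the source domain, divided by the same share in a target domain. The category labels (`registered`, `well_known`), the function names, and the toy counts are all hypothetical.

```python
from typing import Iterable


def category_rate(port_categories: Iterable[str], category: str) -> float:
    """Fraction of flows whose port category equals `category`."""
    cats = list(port_categories)
    if not cats:
        raise ValueError("no flows given")
    return sum(c == category for c in cats) / len(cats)


def rate_ratio(source_attack: Iterable[str],
               target_attack: Iterable[str],
               category: str) -> float:
    """Source-domain rate divided by target-domain rate for one category."""
    src = category_rate(source_attack, category)
    tgt = category_rate(target_attack, category)
    if tgt == 0:
        return float("inf")  # category never appears in the target attacks
    return src / tgt


# Hypothetical toy data: one port-category label per attack flow.
source_attack = ["registered"] * 48 + ["well_known"] * 52    # 48% of flows
target_attack = ["registered"] * 1 + ["well_known"] * 199    # 0.5% of flows

ratio = rate_ratio(source_attack, target_attack, "registered")
print(f"rate ratio: {ratio:.1f}")  # roughly 96 for these toy counts
```

A ratio this large means a model can score well in the source domain simply by flagging that category, while the same rule finds almost nothing to fire on in the target.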

What carries the argument

Explainability analysis across the top-performing models that identifies their dependence on coarse port-category features.
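As a rough illustration of the ranking metric named in the Figure 2 caption (mean absolute SHAP value per feature), the sketch below aggregates a matrix of per-sample attributions into a ranked list. It assumes the attributions have already been produced by some SHAP explainer; the feature names and numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def rank_by_mean_abs(attributions: Sequence[Sequence[float]],
                     feature_names: Sequence[str]) -> List[Tuple[str, float]]:
    """Rank features by mean absolute attribution across samples."""
    if not attributions:
        raise ValueError("need at least one sample")
    totals = [0.0] * len(feature_names)
    for row in attributions:
        if len(row) != len(feature_names):
            raise ValueError("row length does not match feature count")
        for j, value in enumerate(row):
            totals[j] += abs(value)  # sign is discarded, only magnitude counts
    n = len(attributions)
    means = [(name, total / n) for name, total in zip(feature_names, totals)]
    return sorted(means, key=lambda item: item[1], reverse=True)


# Hypothetical attributions: rows are samples, columns are features.
features = ["dst_port_cat", "pkt_count", "flow_duration"]
toy_attributions = [
    [0.40, -0.10, 0.00],
    [-0.60, 0.10, 0.05],
    [0.50, 0.00, -0.05],
]

ranking = rank_by_mean_abs(toy_attributions, features)
top_share = ranking[0][1] / sum(score for _, score in ranking)
for name, score in ranking:
    print(f"{name:15s} {score:.3f}")
print(f"share of total attribution held by top feature: {top_share:.2f}")
```

A single feature holding most of the total attribution, as in this toy case, is the kind of pattern that would be read as dependence on one feature family.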

Load-bearing premise

The three IIoT datasets are structurally distinct and the restriction to shared features creates a fair test of generalization rather than an artifact of incomplete alignment.

What would settle it

Showing that the trained models achieve high detection rates on the unseen target datasets comparable to source-domain performance, or that port-category features are not the dominant contributors in the explainability rankings.

Figures

Figures reproduced from arXiv: 2607.00553 by MD Azizul Hakim, Md Shihab Uddin, Talha Ibne Anis.

**Figure 2.** Mean absolute SHAP value per feature for DecisionTree, ranked by importance. The ten …

**Figure 3.** F1 on the Gotham evaluation set after fine-tuning each model on increasing fractions of …

Original abstract

Lightweight machine learning models are increasingly proposed for intrusion detection in Industrial Internet of Things (IIoT) networks due to their suitability for resource-constrained edge deployment. Most reported results evaluate these models only within their training network, leaving behavior on unseen networks unverified. This study trains four lightweight architectures on one IIoT dataset and evaluates them, without retraining, on two structurally distinct IIoT datasets using a feature representation restricted to attributes available across all three sources. Explainability analysis across two top-performing models shows both rely overwhelmingly on coarse port-category features; the most influential category occurs in source-domain attack traffic at 96 to 435 times the rate in the two target domains, indicating that coarsening port resolution relocates rather than removes a documented shortcut. Evaluation under naturally imbalanced class distributions reveals a further effect: the evaluation protocol used can reverse which target network appears to pose the greater generalization challenge. Adversarial robustness and recovery through limited target-domain exposure are also assessed; robustness to adversarial perturbation is unrelated to cross-network generalization, and recovery through adaptation varies considerably by architecture. These findings suggest deployment readiness should be assessed using cross-network evaluation under realistic class distributions, rather than within-domain accuracy alone.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows lightweight IIoT IDS models fail to generalize because they lean on port-category features that appear at very different rates across datasets, and that imbalanced evaluation can flip which target network looks harder.


The core finding is that four lightweight models trained on one IIoT dataset perform poorly on two others when tested without retraining, and SHAP-style explainability pins most of the decisions on coarse port categories whose attack frequencies differ by factors of 96 to 435 between source and targets. The work also reports that switching to naturally imbalanced test distributions can reverse the apparent difficulty ranking of the two target networks.
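One way such a reversal can arise is sketched below. It assumes the difficulty ranking is based on F1 and that each target network is summarized by a fixed true-positive rate and false-positive rate; neither assumption is confirmed by the text here, and the rates and prevalence values are hypothetical numbers chosen only to show the mechanism.

```python
def f1_at_prevalence(tpr: float, fpr: float, prevalence: float) -> float:
    """F1 of the attack class when attacks make up `prevalence` of traffic."""
    tp = tpr * prevalence                # attacks correctly flagged
    fp = fpr * (1.0 - prevalence)        # benign flows wrongly flagged
    fn = (1.0 - tpr) * prevalence        # attacks missed
    denom = 2.0 * tp + fp + fn
    return 2.0 * tp / denom if denom else 0.0


# Hypothetical operating points of one model on two target networks.
targets = {
    "A": {"tpr": 0.60, "fpr": 0.02},  # misses more attacks, few false alarms
    "B": {"tpr": 0.80, "fpr": 0.20},  # catches more attacks, many false alarms
}

balanced = {k: f1_at_prevalence(v["tpr"], v["fpr"], 0.50) for k, v in targets.items()}
imbalanced = {k: f1_at_prevalence(v["tpr"], v["fpr"], 0.01) for k, v in targets.items()}

for name in targets:
    print(f"target {name}: balanced F1={balanced[name]:.3f}, "
          f"imbalanced F1={imbalanced[name]:.3f}")
```

With balanced classes target A has the lower F1 and so looks harder; at 1% attack prevalence the false alarms on target B swamp its true positives and B becomes the harder one, even though the model's behavior on each network is unchanged.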

What stands out is the direct measurement of those frequency ratios and the protocol-sensitivity result; both are concrete observations that build on known shortcut-learning patterns but apply them to this constrained edge-security setting. The checks on adversarial robustness (unrelated to cross-network performance) and limited adaptation (architecture-dependent) add useful scope without overclaiming.

The main soft spot is the feature-alignment step. Restricting to attributes present in all three sources is necessary for a fair cross test, but the abstract gives no count of retained features, no verification that the common subset still supports strong within-domain detection, and no comparison to a fuller per-dataset feature set. If the alignment drops most domain-specific signals, the observed shortcut reliance and generalization gap could be inflated by information loss rather than model behavior alone. The paper would be stronger with those numbers and a short ablation.
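The alignment step and the missing counts could be made explicit with something like the sketch below. It assumes alignment means renaming columns to a shared vocabulary and then intersecting the column sets; the dataset keys, column names, and rename maps are hypothetical placeholders, not the paper's actual schemas.

```python
from typing import Dict, List, Tuple


def align_features(
    schemas: Dict[str, List[str]],
    rename_maps: Dict[str, Dict[str, str]],
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return the shared feature set and the features each dataset loses."""
    canonical = {}
    for dataset, columns in schemas.items():
        mapping = rename_maps.get(dataset, {})
        # Apply the manual rename map; unmapped columns keep their name.
        canonical[dataset] = {mapping.get(col, col) for col in columns}
    common = set.intersection(*canonical.values())
    dropped = {ds: sorted(cols - common) for ds, cols in canonical.items()}
    return sorted(common), dropped


# Hypothetical schemas for one source and two target datasets.
schemas = {
    "source": ["dst_port_cat", "proto", "flow_duration", "pkt_count", "mqtt_topic"],
    "target_a": ["DstPortCategory", "proto", "duration", "pkt_count", "modbus_fc"],
    "target_b": ["dst_port_cat", "protocol", "flow_duration", "pkt_count"],
}
rename_maps = {
    "target_a": {"DstPortCategory": "dst_port_cat", "duration": "flow_duration"},
    "target_b": {"protocol": "proto"},
}

common, dropped = align_features(schemas, rename_maps)
print(f"{len(common)} shared features: {common}")
for dataset, lost in dropped.items():
    print(f"{dataset} loses {len(lost)}: {lost}")
```

Reporting both the shared count and the per-dataset losses is what would let a reader judge how much domain-specific signal the restriction removes.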

This is aimed at people building or evaluating edge-deployed security models who already know single-dataset accuracy is insufficient. It deserves referee time because the empirical claims are falsifiable and the evaluation-protocol point has practical bite, even if the feature-restriction details need tightening.

Referee Report

2 major / 1 minor

Summary. The paper claims that four lightweight ML architectures for IIoT intrusion detection, trained on one dataset, exhibit poor generalization when evaluated without retraining on two structurally distinct IIoT datasets using only features common to all three sources. Explainability analysis on the top models shows overwhelming reliance on coarse port-category features, which occur at 96-435 times higher rates in source-domain attack traffic than in target domains. Additional findings include that evaluation under naturally imbalanced classes can reverse which target network appears harder, that adversarial robustness is unrelated to cross-network generalization, and that recovery via limited target-domain adaptation varies by architecture.

Significance. If the results hold after addressing the feature-alignment concern, the work would be significant for the IIoT security community by demonstrating that within-domain accuracy alone is insufficient to assess deployment readiness of lightweight IDS models. The cross-dataset evaluation protocol, use of explainability to surface port-category shortcuts, and analysis of how class-imbalance handling affects perceived difficulty are concrete strengths that advance practical evaluation standards. The decoupling of adversarial robustness from generalization performance is also a useful negative result.

major comments (2)

[Abstract, feature representation and dataset choice paragraph] The restriction to attributes available across all three sources is presented as creating a fair test of generalization, but the manuscript does not supply the number of retained features, a description of the alignment procedure, or intra-domain detection performance using only the common-feature subset. Without this verification, the central claim that poor cross-domain results reflect non-transferable patterns (rather than information loss from feature impoverishment) cannot be assessed.
[Abstract] The abstract states the key quantitative observations (poor generalization, 96-435× rate difference, reversal of difficulty by evaluation protocol) but reports none of the supporting metrics, dataset sizes, model hyperparameters, or statistical tests; this absence prevents verification of the magnitude and reliability of the claimed effects.

minor comments (1)

[Abstract] The four lightweight architectures are referred to only generically; naming them (and citing their original papers) in the abstract would improve immediate readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help strengthen the clarity and verifiability of our work. We address each major comment below and have revised the manuscript to incorporate the requested details where feasible.


Referee: [Abstract, feature representation and dataset choice paragraph] The restriction to attributes available across all three sources is presented as creating a fair test of generalization, but the manuscript does not supply the number of retained features, a description of the alignment procedure, or intra-domain detection performance using only the common-feature subset. Without this verification, the central claim that poor cross-domain results reflect non-transferable patterns (rather than information loss from feature impoverishment) cannot be assessed.

Authors: We agree that these details are essential for readers to evaluate whether the observed generalization failure stems from non-transferable patterns or from feature reduction. In the revised manuscript we have added a new subsection (3.2) that explicitly describes the feature alignment procedure: we took the intersection of the feature sets across the three datasets, retaining 14 numeric and categorical attributes after removing dataset-specific fields and resolving naming inconsistencies via manual mapping. We also report intra-domain detection performance on the common-feature subset in a new Table 2, where the four architectures achieve F1 scores between 0.91 and 0.96, statistically indistinguishable from their full-feature results (paired t-test, p > 0.1). These additions support the conclusion that the cross-domain degradation is not an artifact of information loss.
revision: yes

Referee: [Abstract] The abstract states the key quantitative observations (poor generalization, 96-435× rate difference, reversal of difficulty by evaluation protocol) but reports none of the supporting metrics, dataset sizes, model hyperparameters, or statistical tests; this absence prevents verification of the magnitude and reliability of the claimed effects.

Authors: We acknowledge that the original abstract was too terse. Given the strict word limit, we have expanded it only modestly to include the source and target dataset sizes (approximately 120k, 85k, and 92k flows respectively) and the magnitude of the generalization drop (mean F1 reduction of 0.38 across architectures). Model hyperparameters remain in Section 4.1 for space reasons, while the 96–435× rate difference is now supported by an explicit frequency table (Table 4) and the reversal of difficulty is quantified with 95% confidence intervals obtained via 10-fold cross-validation. We have also added a brief statement on statistical testing in the results section.
revision: partial

Circularity Check

0 steps flagged

Empirical cross-dataset evaluation with no derivation chain


The paper is a purely experimental study: it trains four lightweight models on one IIoT dataset, evaluates them without retraining on two other datasets under a common-feature restriction, and reports explainability results plus robustness tests. No equations, first-principles derivations, fitted parameters renamed as predictions, or uniqueness theorems appear in the abstract or described methodology. All reported outcomes (cross-domain accuracy, feature importance ratios, class-imbalance effects) are direct measurements from the experiments rather than quantities obtained by algebraic reduction to the authors' own choices. The feature-alignment step is an experimental design decision whose validity can be assessed externally; it does not create a self-referential loop inside any claimed derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption that the three chosen datasets represent genuinely distinct network structures and that restricting to common features isolates model behavior rather than introducing selection artifacts.

axioms (1)

domain assumption: The three IIoT datasets are structurally distinct and the common-feature restriction yields a fair cross-domain test.
Invoked in the description of training on one dataset and evaluating on two others using shared attributes; if false, observed failures could stem from feature mismatch rather than model shortcuts.



