
arxiv: 2607.02209 · v1 · pith:MTVXPA6Pnew · submitted 2026-07-02 · 💻 cs.CV

MedSaab-US: A Backpropagation-Free Multi-Scale Wavelet-Saab Framework for Thyroid Nodule Segmentation in Ultrasound Images

Pith reviewed 2026-07-03 15:57 UTC · model grok-4.3

classification 💻 cs.CV
keywords: thyroid nodule segmentation · ultrasound images · backpropagation-free · Saab transform · discrete wavelet transform · XGBoost · green learning · TN3K dataset

The pith

MedSaab-US segments thyroid nodules in ultrasound images with a backpropagation-free pipeline of multi-level wavelet and multi-scale Saab transforms fed to XGBoost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MedSaab-US as a non-deep-learning alternative for thyroid nodule segmentation in ultrasound that avoids backpropagation and GPUs. It extracts features by first applying multi-level discrete wavelet transform and then multi-scale Saab transforms at three patch sizes, followed by label-assisted greedy selection of the most useful features. These feed an XGBoost classifier that performs pixel-wise prediction, with all parameters set analytically or through greedy tree building. The method is evaluated on the TN3K dataset of 2879 training and 614 test images, reaching a mean Dice of 0.4784 while keeping the model under 500K parameters and running in 0.3 seconds on CPU. The authors position it as an exploratory baseline and examine the specific difficulties posed by isoechoic nodules.
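The first stage can be illustrated with a small sketch. The Figure 1 caption describes a 2-level DWT that yields seven subbands. The Haar wavelet used below is an assumption, since this summary does not name the wavelet family, and the function names `haar_dwt2` and `two_level_subbands` are hypothetical rather than taken from the paper.

```python
import numpy as np

def haar_dwt2(x):
    """One level of an orthonormal 2-D Haar DWT (assumed wavelet).

    Height and width of x must be even.
    """
    x = np.asarray(x, dtype=float)
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # bottom-left, bottom-right
    approx = (a + b + c + d) / 2.0        # low-pass in both directions
    row_diff = (a + b - c - d) / 2.0      # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0      # left column minus right column
    diag_diff = (a - b - c + d) / 2.0     # diagonal detail
    return approx, (row_diff, col_diff, diag_diff)

def two_level_subbands(image):
    """Return the seven subbands of a 2-level decomposition.

    Order: level-2 approximation, three level-2 details, three level-1 details.
    Height and width of image must be divisible by 4.
    """
    approx1, details1 = haar_dwt2(image)
    approx2, details2 = haar_dwt2(approx1)
    return [approx2, *details2, *details1]
```

Because the transform is orthonormal, the total energy of the seven subbands equals the energy of the input image, which is a convenient sanity check when swapping in a different wavelet.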

Core claim

MedSaab-US extracts multi-scale spatial-frequency features by combining multi-level Discrete Wavelet Transform (DWT) with multi-scale channel-wise Saab transforms at patch sizes of 5x5, 11x11, and 21x21 pixels. Label-Assisted Greedy (LAG) feature selection retains the most discriminative features, which are fed to an XGBoost classifier for pixel-wise prediction. The Saab transform parameters are determined analytically from data statistics, while XGBoost employs iterative greedy tree construction without requiring backpropagation. On the TN3K dataset it achieves a mean Dice coefficient of 0.4784 +/- 0.2190, precision of 0.5768, and recall of 0.5604, with a model footprint under 500K paramete

What carries the argument

The multi-level DWT followed by multi-scale Saab transforms at 5x5, 11x11 and 21x21 patches, with LAG feature selection supplying the input to the XGBoost pixel classifier.
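To make the three scales concrete, here is a minimal sketch of collecting the 5x5, 11x11, and 21x21 neighborhoods around one pixel. Only the patch sizes come from the paper. Reflect padding at the image border and the helper names `patch_at` and `multi_scale_patches` are assumptions made for illustration.

```python
import numpy as np

PATCH_SIZES = (5, 11, 21)  # patch sizes reported in the paper

def patch_at(image, row, col, size):
    """Return the size x size neighborhood centered on (row, col), flattened.

    Border handling by reflect padding is an assumption, not the paper's stated choice.
    """
    half = size // 2
    padded = np.pad(image, half, mode="reflect")
    # After padding, original pixel (row, col) sits at (row + half, col + half),
    # so the window starting at (row, col) in padded coordinates is centered on it.
    window = padded[row:row + size, col:col + size]
    return window.ravel()

def multi_scale_patches(image, row, col, sizes=PATCH_SIZES):
    """Map each patch size to the flattened patch around one pixel."""
    return {size: patch_at(image, row, col, size) for size in sizes}
```

The flattened patches have 25, 121, and 441 entries respectively. Padding on every call keeps the sketch short; a practical version would pad once per image and scale.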

If this is right

  • The small parameter count and CPU inference speed enable deployment on resource-limited clinical hardware without GPUs.
  • All parameters being set analytically or by greedy methods yields a mathematically more tractable pipeline than backpropagation-based networks.
  • Ablation results quantify how much LAG selection and training-set size each contribute to the final Dice score.
  • The framework supplies a reproducible non-DL baseline against which future ultrasound segmentation methods can be compared.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same analytical feature pipeline could be adapted to other ultrasound tasks such as lesion detection where model size and interpretability matter.
  • The reported difficulty with isoechoic nodules points to a possible need for additional scale-specific or texture-specific Saab stages tuned to low-contrast boundaries.
  • Because the method decouples feature extraction from the classifier, swapping XGBoost for another lightweight model could be tested without retraining the entire front end.

Load-bearing premise

The particular combination of multi-level DWT and multi-scale Saab transforms plus LAG selection will produce features that let XGBoost generate accurate pixel-wise nodule masks, including for isoechoic nodules.

What would settle it

Running the full pipeline on a new ultrasound test set dominated by isoechoic nodules and observing whether the mean Dice falls well below 0.4 would directly test the claim.
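For anyone running such a test, the three reported metrics can be computed per image as sketched below. The function name `segmentation_scores` is hypothetical, and the conventions for empty masks (Dice 1.0 when both masks are empty, precision or recall 0.0 when undefined) are assumptions that should be matched to whatever the paper's evaluation code does.

```python
def segmentation_scores(pred, truth):
    """Return (dice, precision, recall) for two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)        # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)    # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)    # false negatives
    denom = 2 * tp + fp + fn
    dice = 2 * tp / denom if denom else 1.0          # assumed convention: both masks empty
    precision = tp / (tp + fp) if tp + fp else 0.0   # assumed convention: nothing predicted
    recall = tp / (tp + fn) if tp + fn else 0.0      # assumed convention: nothing to find
    return dice, precision, recall

def mean_dice(pairs):
    """Average the per-image Dice over an iterable of (pred, truth) mask pairs."""
    scores = [segmentation_scores(p, t)[0] for p, t in pairs]
    return sum(scores) / len(scores)
```

With `pred = [1, 1, 1, 0, 0, 0]` and `truth = [0, 1, 1, 1, 0, 0]` there are two true positives, one false positive, and one false negative, so Dice, precision, and recall all equal 2/3.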

Figures

Figures reproduced from arXiv: 2607.02209 by Mohammad Amanour Rahman.

Figure 1: MedSaab-US pipeline. Input grayscale US image is decomposed by 2-level DWT (Stage 1). Seven subbands are …

Original abstract

Deep learning (DL) methods dominate thyroid nodule segmentation in ultrasound (US) images, achieving high Dice scores but at the cost of millions of parameters, GPU-dependent training via backpropagation, and limited mathematical tractability. These limitations impede deployment in resource-constrained environments. In this paper, we propose MedSaab-US, a backpropagation-free segmentation framework grounded in the Green Learning paradigm. MedSaab-US extracts multi-scale spatial-frequency features by combining multi-level Discrete Wavelet Transform (DWT) with multi-scale channel-wise Saab (Subspace Approximation with Adjusted Bias) transforms at patch sizes of 5 x 5, 11 x 11, and 21 x 21 pixels. Label-Assisted Greedy (LAG) feature selection retains the most discriminative features, which are fed to an XGBoost classifier for pixel-wise prediction. The Saab transform parameters are determined analytically from data statistics, while XGBoost employs iterative greedy tree construction without requiring backpropagation. Evaluated on the TN3K dataset (2,879 training and 614 test images), MedSaab-US achieves a mean Dice coefficient of 0.4784 +/- 0.2190, precision of 0.5768, and recall of 0.5604, with a model footprint under 500K parameters and CPU-only inference in approximately 0.3 seconds per image. We present this result as an exploratory non-DL baseline for thyroid ultrasound segmentation and analyze the specific challenges posed by isoechoic nodules. An ablation study further quantifies the contribution of each pipeline component, including separate evaluations of LAG feature selection and training-set size.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper proposes MedSaab-US, a backpropagation-free segmentation framework for thyroid nodules in ultrasound images that combines multi-level Discrete Wavelet Transform with multi-scale Saab transforms (at 5x5, 11x11, and 21x21 patches), applies Label-Assisted Greedy (LAG) feature selection, and uses an XGBoost classifier for pixel-wise predictions. Saab parameters are derived analytically from data statistics. On the TN3K dataset (2879 train, 614 test images), it reports mean Dice 0.4784 ± 0.2190, precision 0.5768, recall 0.5604, with <500K parameters and ~0.3s CPU inference per image, explicitly framed as an exploratory non-DL baseline with analysis of isoechoic nodule challenges and an ablation study on pipeline components.

Significance. If the reported metrics hold, the work provides a mathematically tractable, low-footprint, CPU-only baseline that avoids backpropagation and GPU requirements. This could be useful for resource-constrained clinical settings and for establishing non-DL reference points in thyroid US segmentation research. The analytical Saab construction, LAG selection, and explicit ablation study on component contributions are positive aspects that enhance reproducibility and interpretability.

minor comments (2)
  1. The abstract states the method 'analyzes the specific challenges posed by isoechoic nodules' but does not quantify how many test cases fall into this category or report separate metrics; adding this breakdown would strengthen the baseline presentation without altering the central claim.
  2. The ablation study is described as quantifying contributions of each component (including LAG and training-set size) but no table or figure reference is provided in the abstract; ensure the full manuscript includes a clear tabular summary of these results.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive evaluation of MedSaab-US as a useful non-DL baseline and for recommending minor revision. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper presents MedSaab-US as an exploratory non-DL baseline using multi-level DWT, multi-scale Saab transforms (parameters set analytically from data statistics), LAG selection, and standard XGBoost greedy trees for pixel-wise segmentation on TN3K. No load-bearing step reduces by the paper's equations or self-citation to its own inputs; the Dice/precision/recall figures are empirical outputs of the pipeline rather than tautological fits, and the method is deliberately positioned as a low-parameter CPU baseline without invoking uniqueness theorems or ansatzes from prior self-work.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

Abstract-only review limits visibility into exact assumptions; no new entities postulated; relies on standard properties of DWT and analytical Saab.

free parameters (1)
  • patch sizes = 5x5, 11x11, 21x21
    Chosen for multi-scale extraction; the abstract does not derive these values from first principles
axioms (2)
  • [standard math] Discrete Wavelet Transform provides a multi-scale spatial-frequency decomposition of images
    Invoked as the first stage of feature extraction
  • [domain assumption] Saab transform parameters are fully determined by data statistics without iterative optimization
    Central to the backpropagation-free claim
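
The second axiom can be illustrated with a simplified single-stage Saab fit: a constant DC kernel, AC kernels taken from a PCA of the DC-removed patches, and a bias large enough to keep AC responses non-negative on the fitting data. This is a sketch under stated assumptions, not the paper's implementation. It omits the channel-wise variant, applies the bias to AC responses only, and the names `fit_saab` and `apply_saab` are hypothetical.

```python
import numpy as np

def fit_saab(patches, num_ac):
    """Fit one simplified Saab stage on flattened patches of shape (n, dim).

    Every quantity is a closed-form data statistic; there is no iterative optimization.
    """
    patches = np.asarray(patches, dtype=float)
    dim = patches.shape[1]
    dc_kernel = np.ones(dim) / np.sqrt(dim)           # constant unit-norm kernel
    dc = patches @ dc_kernel
    residual = patches - np.outer(dc, dc_kernel)      # remove each patch's DC part
    mean = residual.mean(axis=0)
    centered = residual - mean
    cov = centered.T @ centered / len(centered)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    ac_kernels = eigvecs[:, ::-1][:, :num_ac].T       # highest-variance directions first
    # Since each AC kernel has unit norm, |response| <= norm of the centered patch,
    # so this bias makes every AC response non-negative on the fitting data.
    bias = np.linalg.norm(centered, axis=1).max()
    return {"dc": dc_kernel, "ac": ac_kernels, "mean": mean, "bias": bias}

def apply_saab(patches, model):
    """Project patches onto the fitted kernels; column 0 is DC, the rest are AC."""
    patches = np.asarray(patches, dtype=float)
    dc = patches @ model["dc"]
    residual = patches - np.outer(dc, model["dc"]) - model["mean"]
    ac = residual @ model["ac"].T + model["bias"]
    return np.column_stack([dc, ac])
```

Fitting reduces to one covariance matrix and one eigendecomposition per scale, which is the sense in which such a front end needs no backpropagation.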



Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages

  1. [1] X. Gong, S. Liu, F. Zhou, and C. Wang, “TN3K: A Thyroid Nodule Three-view Knowledge Dataset for Ultrasound Image Segmentation,” Medical Image Analysis, 2022.
  2. [2] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Proc. MICCAI, 2015, pp. 234–241.
  3. [3] Y. Zhang, H. Liu, and Q. Hu, “TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation,” in Proc. MICCAI, 2021.
  4. [4] Y. Wu, Y. Xia, Y. Song, D. Zhang, and W. Cai, “TRFE-Net: Two-Stream Residual Feature Enhancement Network for Thyroid Nodule Segmentation,” IEEE J. Biomed. Health Inform., vol. 24, no. 11, pp. 3092–3104, 2020.
  5. [5] A. Su et al., “SwinE-Net: Hybrid Deep Learning Approach to Novel Thyroid Nodule Segmentation,” J. Comput. Design Eng., vol. 10, no. 1, pp. 116–135, 2023.
  6. [6] S. Liu, H. Hu, L. Zhang, and X. Gong, “MADGNet: Multi-Scale Aligned Dual-Branch Guidance Network for Thyroid Nodule Segmentation in Ultrasound Images,” Comput. Biol. Med., vol. 169, p. 107874, 2024.
  7. [7] A. Rahman, “Thyroid Nodule Segmentation Using DeepLabv3+ with ECB, CMM, and SSEM Modules,” IEEE Access, vol. 13, 2025.
  8. [8] C.-C. J. Kuo and Y. Chen, “Green Learning: Introduction, Examples and Outlook,” J. Visual Commun. Image Represent., vol. 90, p. 103729, 2023.
  9. [9] C.-C. J. Kuo, M. Zhang, S. Li, J. Duan, and Y. Chen, “Interpretable Convolutional Neural Networks via Feedforward Design,” J. Visual Commun. Image Represent., vol. 60, pp. 346–359, 2019.
  10. [10] Y. Chen, H. Liu, P. J. Zhang, C. Rozière, and C.-C. J. Kuo, “PixelHop: A Successive Subspace Learning (SSL) Method for Object Recognition,” J. Visual Commun. Image Represent., vol. 70, p. 102749, 2020.
  11. [11] B. Zhang et al., “VoxelHop: Successive Subspace Learning for ALS Disease Classification Using Structural MRI,” IEEE J. Biomed. Health Inform., vol. 26, no. 1, pp. 97–107, 2022.
  12. [12] Y. Zhang, S. Tan, T. Feng, C.-C. J. Kuo, and B. J. Tromberg, “RadHop: A Green, Lightweight, and Interpretable Method for Prostate Cancer Grading,” in Proc. ISBI, 2022.
  13. [13] T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System,” in Proc. KDD, 2016, pp. 785–794.
  14. [14] S. G. Mallat, “A Theory for Multiresolution Signal Decomposition: The Wavelet Representation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 11, no. 7, pp. 674–693, 1989.