arxiv: 2607.01089 · v1 · pith:34LFM5LNnew · submitted 2026-07-01 · 📡 eess.IV · cs.LG

Group-invariant Coresets for Data-efficient Active Learning

Pith reviewed 2026-07-02 04:04 UTC · model grok-4.3

classification 📡 eess.IV cs.LG
keywords group-invariant coreset · active learning · quotient space · orbit coverage · label efficiency · transformation group · invariant embeddings · generalization bound

The pith

A group-invariant coreset method selects samples by their orbits under known transformations to avoid querying redundant symmetric copies in active learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes that incorporating known data symmetries into coreset selection for active learning allows selection to operate on orbits rather than individual samples. This is done by working in the quotient space using either canonical forms or invariant embeddings. If true, this would mean fewer labels are needed to achieve good coverage when symmetries create many equivalent versions of the same data point. Standard coreset methods waste budget on transformed duplicates, while this approach combines quotient k-center selection with orbit-averaged loss during training. Experiments on scale-invariant synthetic data and rotated images support improved efficiency.
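
As a rough illustration of the orbit-averaged loss mentioned above, the sketch below averages a per-sample loss over every element of a small, known group. Everything here is an assumption made for illustration: the group is the four 90-degree rotations (C4) of a 2x2 grid, the model is a hypothetical linear predictor, and the loss is squared error. The paper's actual loss may differ, for example in how group elements are sampled or weighted.

```python
def rot90(grid):
    """Rotate a square grid (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def orbit(grid):
    """All four rotations of the grid under the cyclic group C4."""
    members = [grid]
    for _ in range(3):
        members.append(rot90(members[-1]))
    return members

def predict(weights, grid):
    """Hypothetical linear model: weighted pixel sum (not rotation invariant)."""
    return sum(w * v for w_row, g_row in zip(weights, grid)
               for w, v in zip(w_row, g_row))

def squared_loss(weights, grid, target):
    """Plain per-sample squared error."""
    return (predict(weights, grid) - target) ** 2

def orbit_averaged_loss(weights, grid, target):
    """Average the per-sample loss over every group element applied to the input."""
    members = orbit(grid)
    return sum(squared_loss(weights, g, target) for g in members) / len(members)

weights = [[1, 2], [3, 4]]  # integer weights keep the arithmetic exact
x = [[1, 0], [0, 0]]
x_rotated = rot90(x)        # a different sample from the same orbit

plain = (squared_loss(weights, x, 0), squared_loss(weights, x_rotated, 0))
averaged = (orbit_averaged_loss(weights, x, 0),
            orbit_averaged_loss(weights, x_rotated, 0))
```

In this toy setup the plain loss differs between `x` and its rotated copy, while the orbit-averaged loss is the same for both, since it sums over the same four grids. That is the sense in which a label for one orbit member carries over to its transformed copies.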

Core claim

GRINCO performs acquisition in the quotient space induced by a transformation group so that selection operates on orbits rather than raw samples. It uses canonical representatives or learned orbit-separating invariant embeddings to define quotient metrics, combines this with invariant training through an orbit-averaged loss, and derives a generalization bound relating excess orbit-averaged risk to quotient-space coverage, label uncertainty, and intra-orbit variability.
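
To make the quotient-space acquisition concrete, here is a minimal sketch of greedy k-center (farthest-first) selection run once with a quotient metric and once with the plain Euclidean metric. It is not the authors' implementation. The assumptions are: the group is positive scaling, the canonical representative is the unit-norm vector, the quotient metric is the Euclidean distance between canonical forms, and the toy pool of four rays with three scaled copies each is invented for illustration. All function names are hypothetical.

```python
import math

def canonical_form(x):
    """Canonical orbit representative under positive scaling: the unit-norm vector."""
    norm = math.sqrt(sum(v * v for v in x))
    return tuple(v / norm for v in x)

def quotient_distance(x, y):
    """Assumed quotient metric: Euclidean distance between canonical forms."""
    return math.dist(canonical_form(x), canonical_form(y))

def greedy_k_center(points, budget, dist):
    """Farthest-first traversal: repeatedly add the point farthest from the selection."""
    selected = [0]  # start from the first point so the run is deterministic
    min_dist = [dist(points[0], p) for p in points]
    while len(selected) < budget:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [min(d, dist(points[nxt], p)) for d, p in zip(min_dist, points)]
    return selected

# Toy pool: four rays (orbits), each with three scaled copies of one direction.
directions = [(5, 0), (4, 3), (3, 4), (0, 5)]
scales = [1, 10, 100]
pool = [(s * dx, s * dy) for dx, dy in directions for s in scales]
orbit_id = [i for i in range(len(directions)) for _ in scales]

def orbits_covered(selection):
    """Number of distinct orbits hit by a selection of pool indices."""
    return len({orbit_id[i] for i in selection})

quotient_pick = greedy_k_center(pool, 4, quotient_distance)
euclid_pick = greedy_k_center(pool, 4, math.dist)
covered = (orbits_covered(quotient_pick), orbits_covered(euclid_pick))
```

With a budget of four on this pool, the quotient-metric run picks one point per ray, whereas the Euclidean run spends two of its four picks on the first ray because the far-scaled copy looks distant in input space. The toy is constructed to show that effect and says nothing about how large the gap is on real data.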

What carries the argument

GRINCO, the group-invariant coreset framework that performs acquisition in the quotient space induced by a transformation group.
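
For orientation, the objects named in the summary can be written in standard textbook notation. This is a sketch under assumptions, not the paper's own formulation: it assumes a group $G$ acting by isometries on a metric space $(\mathcal{X}, d)$, a finite group for the averaged loss, and symbols ($B$ for budget, $S$ for the selected set, $f$ for the model, $\ell$ for the loss) chosen here for illustration.

```latex
% Orbit of a sample and the induced quotient (pseudo-)metric
[x] = \{\, g \cdot x : g \in G \,\}, \qquad
d_{\mathcal{X}/G}\big([x],[y]\big) = \inf_{g \in G} d\big(x,\, g \cdot y\big)

% k-center selection carried out on orbits, with budget B
\min_{S \subseteq \{1,\dots,n\},\; |S| \le B} \;
\max_{1 \le i \le n} \; \min_{j \in S} \;
d_{\mathcal{X}/G}\big([x_i],[x_j]\big)

% Orbit-averaged loss for a finite group
\bar{\ell}(f; x, y) = \frac{1}{|G|} \sum_{g \in G} \ell\big(f(g \cdot x),\, y\big)
```

When the action is not isometric, as with scaling in Euclidean space, the infimum formula is not directly usable, which is one reason to define the metric through canonical representatives or invariant embeddings instead.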

If this is right

  • GRINCO improves orbit coverage compared to conventional coreset baselines.
  • It achieves stronger label efficiency especially when group-induced redundancy is substantial.
  • The generalization bound connects excess risk to how well the quotient space is covered.
  • Performance gains appear on both synthetic scale-invariant data and image benchmarks with rotations.
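
The orbit-coverage claim can be made concrete with a small calculation. The sketch below assumes that orbit efficiency is the number of distinct orbits selected divided by the budget; that definition is inferred from the figure captions on this page, not quoted from the paper. It also assumes each class (A, B, C, D) in the rays dataset corresponds to one orbit. The per-class counts are the ones reported in the Figure 4 caption for a single run at budget 4.

```python
def orbit_efficiency(selected_orbits):
    """Fraction of the budget spent on distinct orbits (assumed definition)."""
    if not selected_orbits:
        raise ValueError("selection must be non-empty")
    return len(set(selected_orbits)) / len(selected_orbits)

def expand_counts(counts, labels="ABCD"):
    """Turn per-class counts such as (2, 0, 0, 2) into a list of orbit labels."""
    return [label for label, n in zip(labels, counts) for _ in range(n)]

# Class-wise counts from the Figure 4 caption, budget B = 4.
figure4_counts = {
    "random": (2, 0, 0, 2),
    "euclidean_coreset": (2, 1, 0, 1),
    "grinco": (1, 1, 1, 1),
}

efficiency = {
    name: orbit_efficiency(expand_counts(counts))
    for name, counts in figure4_counts.items()
}
```

Under these assumptions the single run shown in Figure 4 gives 0.5 for random, 0.75 for the Euclidean coreset, and 1.0 for GRINCO. The curves in Figure 5 are averages over 30 Monte Carlo runs, so they need not match these single-run values.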

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar quotient methods could extend to other data types with known symmetries like translations or reflections.
  • Learned invariant embeddings might allow the approach even when the group is only partially known.
  • Reducing intra-orbit variability through the averaged loss could improve model robustness beyond label savings.
  • Testing on sequential data with time-shift groups would check if the efficiency gains hold in other modalities.

Load-bearing premise

The transformation group must be known in advance and must create substantial redundancy that can be removed without discarding information needed for the learning task.

What would settle it

Running the method on image data with known rotations and finding that it requires as many or more labels as standard coresets to reach the same accuracy would show the claim does not hold.

Figures

Figures reproduced from arXiv: 2607.01089 by J. C. M. Bermudez, L. C. Ayres, R. A. Borsoi, S. J. M. de Almeida.

Figure 1: Illustrative example. In the input space …
Figure 3: Visualizing the quotient mapping. Left: The input space …
Figure 2: Overview of the group-invariant coreset and AL pipeline. A. Group-invariant Coresets. 1) Quotient space representation: Following the group-theoretic view of invariance, we therefore work on the quotient space X/G of the data under G and formulate selection directly in the resulting quotient geometry. This way, orbits serve as natural summaries under invariance assumptions, which motivates operating on equ…
Figure 5: Figure 5 shows orbit efficiency η(B) as a function of budget, averaged over 30 Monte Carlo runs. As expected, GRINCO achieves η(B) = 1 for B ≤ 4, with one representative per orbit, while random yields redundant selections even at very low budgets. The Euclidean coreset baseline achieves intermediate performance at B = 3, 4. For B > 4, orbit efficiency decreases for all methods as the budget exceeds the number of di…
Figure 4: Selected coresets for the rays dataset at budget B = 4 using random, Euclidean coreset, and GRINCO. Points are colored by class, and marker type indicates the selected samples. Class-wise counts are random (2, 0, 0, 2), Euclidean coreset (2, 1, 0, 1), and GRINCO (1, 1, 1, 1) for classes (A, B, C, D). … ray while missing some rays entirely. The Euclidean coreset baseline improves coverage, but it can still se…
Figure 6: Labeling-efficiency results on rotated CIFAR-10 (…
Figure 7: Full active-learning trajectories on rotated CIFAR-10. (a) …
Abstract

Active learning reduces labeling cost by querying the most informative unlabeled samples, but standard coreset methods ignore known data symmetries and can waste budget on transformed versions of the same instance. We propose GRINCO, a group-invariant coreset framework that performs acquisition in the quotient space induced by a transformation group, so that selection operates on orbits rather than raw samples. The method uses either canonical representatives or learned orbit-separating invariant embeddings to define practical quotient metrics, and combines quotient-space k-center selection with invariant training through an orbit-averaged loss. We further derive a generalization bound that relates excess orbit-averaged risk to quotient-space coverage, label uncertainty, and intra-orbit variability. Experiments on synthetic scale-invariant data and image benchmarks with rotation-induced redundancy show that GRINCO improves orbit coverage and achieves stronger label efficiency than conventional coreset baselines, especially when group-induced redundancy is substantial.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Circularity Check

0 steps flagged

No significant circularity

The provided abstract and description contain no equations, fitted parameters presented as predictions, or self-citation chains that reduce the central claims (quotient-space coreset selection, orbit-averaged loss, or generalization bound) to inputs by construction. The bound is stated as relating excess orbit-averaged risk to coverage, uncertainty, and intra-orbit variability without visible self-referential fitting. No load-bearing self-citations or ansatzes smuggled via prior work are quoted. This is the common case of a self-contained extension with independent experimental support.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The method implicitly assumes a known group action and the existence of practical quotient metrics, but these are not quantified.

pith-pipeline@v0.9.1-grok · 5695 in / 1172 out tokens · 31813 ms · 2026-07-02T04:04:13.700623+00:00 · methodology
