pith. sign in

arxiv: 2607.00373 · v1 · pith:DXEG3U2Cnew · submitted 2026-07-01 · 📊 stat.ME

Confidence Intervals for the Risk Difference in Combined Unilateral and Bilateral Data Incorporating a Distribution-Based Approach

Pith reviewed 2026-07-02 08:09 UTC · model grok-4.3

classification 📊 stat.ME
keywords risk differenceconfidence intervalunilateral bilateral datapaired binary outcomesdistribution-based approachMOVERintra-subject correlationskewness
0
0 comments X

The pith

Distribution-based intervals for risk difference in paired binary data achieve nominal coverage while capturing skewness in small samples.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes deriving a confidence interval directly from the probability distribution of the risk difference estimator, together with a modified MOVER procedure, for studies that combine unilateral and bilateral binary outcomes from paired organs. This addresses the limitation of existing asymptotic normality methods that often fail to reflect skewness when samples are small. Simulations across many parameter settings show the new intervals maintain coverage close to the nominal level and produce widths comparable to standard procedures, with the added ability to reflect finite-sample asymmetry. Real-data analyses of two datasets produce conclusions consistent with competing methods. The work supplies an alternative framework for interval estimation that explicitly incorporates the combined data structure and intra-subject correlation.

Core claim

A distribution-based confidence interval derived from the probability distribution of the risk difference estimator, together with a modified MOVER procedure that accounts for intra-subject correlation, yields coverage probabilities close to the nominal level and interval widths comparable to asymptotic methods across a broad range of settings; in small samples it captures skewness in the sampling distribution that asymptotic normality methods do not reflect.

What carries the argument

The distribution-based confidence interval obtained by direct use of the probability distribution of the risk difference estimator (rather than asymptotic normality), combined with a modified MOVER procedure that incorporates intra-subject correlation.

If this is right

  • As sample size increases, the proposed distribution-based interval exhibits satisfactory performance comparable to existing methods.
  • The distribution-based interval achieves coverage probabilities close to the nominal level with interval widths comparable to those of existing procedures.
  • In small-sample settings the distribution-based interval captures skewness in the sampling distribution that is not reflected by methods relying on asymptotic normality.
  • Analyses of real-world datasets demonstrate practical applicability and yield consistent inferential conclusions across methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same distributional derivation could be applied to other effect measures such as the risk ratio or odds ratio under the same unilateral-plus-bilateral structure.
  • The approach points toward possible gains in accuracy for small-sample inference in other clustered binary settings that exhibit intra-cluster dependence.
  • Future numerical work could examine whether the method remains computationally feasible when the number of bilateral pairs grows large or when additional covariates are present.

Load-bearing premise

An accurate probability distribution for the risk difference estimator can be derived that properly accounts for the combined unilateral and bilateral structure together with intra-subject correlation.

What would settle it

Repeated simulation studies in small-sample regimes with known skewness parameters where the empirical coverage of the proposed intervals falls materially below the nominal level (for example, below 90 percent for nominal 95 percent intervals).

Figures

Figures reproduced from arXiv: 2607.00373 by Chang-Xing Ma, Jia Zhou.

Figure 1
Figure 1. Figure 1: Box plots of the empirical coverage probability (ECP) for the three asymptotic methods [PITH_FULL_IMAGE:figures/full_fig_p011_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Box plots of the mean interval width (MIW) for methods under consideration. The [PITH_FULL_IMAGE:figures/full_fig_p014_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Box plots of the ratio of mesial non-coverage probability to the distal non-coverage [PITH_FULL_IMAGE:figures/full_fig_p017_3.png] view at source ↗
read the original abstract

Combined unilateral and bilateral binary outcomes frequently arise in studies involving paired organs. The risk difference is a clinically interpretable measure for comparing treatment effects between groups. Existing confidence interval methods are primarily based on asymptotic normality and may fail to adequately reflect finite-sample distributional features, particularly skewness. To address this issue, we propose a distribution-based confidence interval derived from the probability distribution of the risk difference estimator and a modified MOVER procedure that accounts for intra-subject correlation. Their performances are compared with those of commonly used asymptotic methods through extensive simulation studies. Across a broad range of parameter settings, all methods exhibited satisfactory performance as sample size increased. The proposed distribution-based interval achieved coverage probabilities close to the nominal level with interval widths comparable to those of existing procedures. In small sample settings, it was able to capture skewness in the sampling distribution that was not reflected by methods relying on asymptotic normality. Analyses of two real-world datasets demonstrated the practical applicability of the competing methods and yielded consistent inferential conclusions. The proposed approach provides an alternative framework for interval estimation of the risk difference in studies involving combined unilateral and bilateral binary outcomes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a distribution-based confidence interval for the risk difference estimator in studies with combined unilateral and bilateral binary outcomes, derived directly from the probability distribution of the estimator, together with a modified MOVER procedure that incorporates intra-subject correlation. These are compared via simulation to standard asymptotic-normality methods across a range of parameter settings; the distribution-based interval is reported to achieve coverage close to the nominal level with comparable widths and to capture skewness in small samples that asymptotic methods miss. Two real datasets are analyzed to illustrate practical use, with all methods yielding consistent conclusions.

Significance. If the claimed derivation of the exact sampling distribution is correct and correctly encodes the unilateral/bilateral structure plus correlation, the work supplies a practical small-sample alternative for a common design in paired-organ studies. The reported simulation evidence of near-nominal coverage and explicit skewness capture, together with the real-data applications, would constitute a useful contribution to the interval-estimation literature for correlated binary data.

major comments (2)
  1. [Abstract / Methods] Abstract and Methods: the central claim that the interval is 'derived from the probability distribution of the risk difference estimator' is load-bearing for the reported small-sample skewness advantage, yet the manuscript supplies neither the explicit probability mass function, the enumeration over the four paired binary outcomes per subject, nor the weighting by the intra-subject correlation parameter. Without this construction it is impossible to verify that the distribution is correctly specified rather than approximated.
  2. [Simulation studies] Simulation section: coverage probabilities are stated to be 'close to the nominal level' and widths 'comparable,' but no table or text reports the Monte Carlo standard errors on those coverage estimates, the exact grid of sample sizes, prevalence values, or correlation parameters, or the number of replications, preventing assessment of whether the reported advantage over asymptotic methods is statistically reliable.
minor comments (2)
  1. [Abstract] The abstract refers to 'extensive simulation studies' and 'two real-world datasets' without naming the datasets or providing even a brief description of their structure (e.g., number of subjects, proportion unilateral vs. bilateral).
  2. [Methods] Notation for the risk difference estimator and the correlation parameter is introduced without an explicit equation linking them to the four possible paired outcomes.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and positive overall assessment. We address each major comment below and will revise the manuscript accordingly to improve clarity and reproducibility.

read point-by-point responses
  1. Referee: [Abstract / Methods] Abstract and Methods: the central claim that the interval is 'derived from the probability distribution of the risk difference estimator' is load-bearing for the reported small-sample skewness advantage, yet the manuscript supplies neither the explicit probability mass function, the enumeration over the four paired binary outcomes per subject, nor the weighting by the intra-subject correlation parameter. Without this construction it is impossible to verify that the distribution is correctly specified rather than approximated.

    Authors: We acknowledge that the explicit probability mass function, the enumeration over the four paired binary outcomes, and the explicit weighting by the intra-subject correlation parameter were not presented in sufficient detail. The distribution-based CI is constructed by enumerating the joint probabilities of the unilateral and bilateral binary outcomes per subject under the specified correlation structure. In the revised manuscript we will add the full derivation, including the explicit PMF and the step-by-step weighting procedure, to the Methods section so that the construction can be directly verified. revision: yes

  2. Referee: [Simulation studies] Simulation section: coverage probabilities are stated to be 'close to the nominal level' and widths 'comparable,' but no table or text reports the Monte Carlo standard errors on those coverage estimates, the exact grid of sample sizes, prevalence values, or correlation parameters, or the number of replications, preventing assessment of whether the reported advantage over asymptotic methods is statistically reliable.

    Authors: We agree that Monte Carlo standard errors, the precise simulation grid, and the number of replications should be reported. The simulations used 10,000 replications; we will add these details together with Monte Carlo standard errors (computed as sqrt(p(1-p)/R) for each coverage probability) and a complete table of the sample-size, prevalence, and correlation grid in the revised Simulation section. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper derives the distribution-based CI directly from the probability distribution of the risk difference estimator for combined unilateral/bilateral data, then evaluates it via simulation against asymptotic methods. No quoted equations or steps reduce a claimed prediction to a fitted parameter by construction, nor does any load-bearing premise collapse to a self-citation chain. The abstract and description present an independent combinatorial or enumerative construction of the sampling distribution that encodes intra-subject correlation, with external validation through coverage and width comparisons. This matches the default expectation of a non-circular paper whose central result is not tautological with its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the distribution-based method implies unstated modeling assumptions about the sampling distribution but none are detailed.

pith-pipeline@v0.9.1-grok · 5723 in / 1129 out tokens · 25396 ms · 2026-07-02T08:09:30.803916+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

31 extracted references · 2 canonical work pages

  1. [1]

    Statistical methods in ophthalmology: an adjustment for the intraclass correlation between eyes.Biometrics, 38, March 1982

    Bernard Rosner. Statistical methods in ophthalmology: an adjustment for the intraclass correlation between eyes.Biometrics, 38, March 1982

  2. [2]

    Gerade E. Dallal. Paired bernoulli trials.Biometrics, 44, March 1988

  3. [3]

    Statistical methods in ophthalmology: an adjusted chi-square approach.Bio- metrics, 45, 1989

    Allan Donner. Statistical methods in ophthalmology: an adjusted chi-square approach.Bio- metrics, 45, 1989

  4. [4]

    J. R. Thompson. The chi-square test for data collected on eyes.British Journal of Ophthal- mology, 77(2), 1993. 21

  5. [5]

    Testing the homogeneity of two proportions for correlated bilateral data via the clayton copula.arXiv preprint arXiv:2502.00523, 2025

    Shuyi Liang, Takeshi Emura, Chang-Xing Ma, Yijing Xin, and Xin-Wei Huang. Testing the homogeneity of two proportions for correlated bilateral data via the clayton copula.arXiv preprint arXiv:2502.00523, 2025

  6. [6]

    Goodness-of-fit tests for combined unilateral and bilateral data.Mathematics, 13(15):2501, 2025

    Jia Zhou and Chang-Xing Ma. Goodness-of-fit tests for combined unilateral and bilateral data.Mathematics, 13(15):2501, 2025

  7. [7]

    Testing the equality of proportions for correlated otolaryngologic data.Computational Statistics and Data Analysis, 52, 2008

    Nian-Sheng Tang, Man-Lai Tang, and Shi-Fang Qiu. Testing the equality of proportions for correlated otolaryngologic data.Computational Statistics and Data Analysis, 52, 2008

  8. [8]

    Testing the equality of two proportions for combined unilateral and bilateral data.Communications in Statistics—Simulation and Computation, 37(8), 2008

    Yan-bo Pei, Man-Lai Tang, and Jian-Hua Guo. Testing the equality of two proportions for combined unilateral and bilateral data.Communications in Statistics—Simulation and Computation, 37(8), 2008

  9. [9]

    Testing equality of proportions for correlated binary data in ophthalmologic studies.Journal of Biopharmaceutical Statistics, 27(4), 2017

    Chang-Xing Ma and Song Liu. Testing equality of proportions for correlated binary data in ophthalmologic studies.Journal of Biopharmaceutical Statistics, 27(4), 2017

  10. [10]

    Testing the equality of proportions for combined unilateral and bilateral data under equal intraclass correlation model.Statistics in Biopharmaceutical Research, 2022

    Chang-Xing Ma and Huipei Wang. Testing the equality of proportions for combined unilateral and bilateral data under equal intraclass correlation model.Statistics in Biopharmaceutical Research, 2022

  11. [11]

    A revisit to sample size and power calculations for testing odds ratio in two independent binomials.Biometrics, 69(2):530–536, 2013

    Fang Liu. A revisit to sample size and power calculations for testing odds ratio in two independent binomials.Biometrics, 69(2):530–536, 2013

  12. [12]

    Exact confidence intervals for the relative risk and the odds ratio.Biometrics, 71(4), 2015

    Weizhen Wang and Guogen Shan. Exact confidence intervals for the relative risk and the odds ratio.Biometrics, 71(4), 2015

  13. [13]

    Statistical inference for odds ratio of two proportions in bilateral correlated data.Axioms, 11(10), 2022

    Zhiming Li and Changxing Ma. Statistical inference for odds ratio of two proportions in bilateral correlated data.Axioms, 11(10), 2022

  14. [14]

    Simultaneous confidence bounds for relative risks in multiple compar- isons to control.Statistics in medicine, 29(30):3232–3244, 2010

    Bernhard Klingenberg. Simultaneous confidence bounds for relative risks in multiple compar- isons to control.Statistics in medicine, 29(30):3232–3244, 2010

  15. [15]

    Confidence intervals for ratios of proportions in stratified bilateral correlated data.Statistics and Its Interface, 18(4):411–443, 2025

    Wanqing Tian and Chang-Xing Ma. Confidence intervals for ratios of proportions in stratified bilateral correlated data.Statistics and Its Interface, 18(4):411–443, 2025

  16. [16]

    Interval estimation of relative risks for combined unilateral and bilateral correlated data.Journal of Biopharmaceutical Statistics, 35(2):163–186, 2025

    Kejia Wang and Chang-Xing Ma. Interval estimation of relative risks for combined unilateral and bilateral correlated data.Journal of Biopharmaceutical Statistics, 35(2):163–186, 2025

  17. [17]

    Relative risk versus absolute risk: one cannot be interpreted without the other.Nephrology Dialysis Transplanta- tion, 32(suppl 2):ii13–ii18, 2017

    Marlies Noordzij, Merel van Diepen, Fergus C Caskey, and Kitty J Jager. Relative risk versus absolute risk: one cannot be interpreted without the other.Nephrology Dialysis Transplanta- tion, 32(suppl 2):ii13–ii18, 2017

  18. [18]

    An assessment of clinically useful measures of the consequences of treatment.New England journal of medicine, 318(26):1728– 1733, 1988

    Andreas Laupacis, David L Sackett, and Robin S Roberts. An assessment of clinically useful measures of the consequences of treatment.New England journal of medicine, 318(26):1728– 1733, 1988

  19. [19]

    Basic statistics for clinicians: 3

    Roman Jaeschke, Gordon Guyatt, Harry Shannon, Stephan Walter, Deborah Cook, and Nancy Heddle. Basic statistics for clinicians: 3. assessing the effects of treatment: measures of association.CMAJ: Canadian Medical Association Journal, 152(3):351, 1995. 22

  20. [20]

    Odds ratio, relative risk, absolute risk reduction, and the number needed to treat—which of these should we use?Value in health, 5(5):431–436, 2002

    Edna Schechtman. Odds ratio, relative risk, absolute risk reduction, and the number needed to treat—which of these should we use?Value in health, 5(5):431–436, 2002

  21. [21]

    Clinically useful measures of effect in binary analyses of randomized trials.Journal of clinical epidemiology, 47(8):881–889, 1994

    John C Sinclair and Michael B Bracken. Clinically useful measures of effect in binary analyses of randomized trials.Journal of clinical epidemiology, 47(8):881–889, 1994

  22. [22]

    The number needed to treat: a clinically useful measure of treatment effect.Bmj, 310(6977):452–454, 1995

    Richard J Cook and David L Sackett. The number needed to treat: a clinically useful measure of treatment effect.Bmj, 310(6977):452–454, 1995

  23. [23]

    Andrew P Grieve. The number needed to treat: a useful clinical measure or a case of the emperor’s new clothes?Pharmaceutical Statistics: The Journal of Applied Statistics in the Pharmaceutical Industry, 2(2):87–102, 2003

  24. [24]

    number needed to treat

    Finlay A McAlister. The “number needed to treat” turns 20—and continues to be used and misused.Cmaj, 179(6):549–553, 2008

  25. [25]

    Testing risk difference of two proportions for combined uni- lateral and bilateral data.arXiv preprint arXiv:2510.18834, 2025

    Jia Zhou and Chang-Xing Ma. Testing risk difference of two proportions for combined uni- lateral and bilateral data.arXiv preprint arXiv:2510.18834, 2025

  26. [26]

    Construction of confidence limits about effect measures: a general approach.Statistics in Medicine, 27(10), 2008

    Guang Yong Zou and Allan Donner. Construction of confidence limits about effect measures: a general approach.Statistics in Medicine, 27(10), 2008

  27. [27]

    Program Generation, Optimization, and Platform Adaptation

    Matteo Frigo and Steven G. Johnson. The design and implementation of FFTW3.Proceedings of the IEEE, 93(2):216–231, 2005. Special issue on “Program Generation, Optimization, and Platform Adaptation”

  28. [28]

    Probable inference, the law of succession, and statistical inference.Journal of the American Statistical Association, 22(158):209–212, 1927

    Edwin B Wilson. Probable inference, the law of succession, and statistical inference.Journal of the American Statistical Association, 22(158):209–212, 1927

  29. [29]

    Approximate is better than “exact” for interval estimation of binomial proportions.The American Statistician, 52(2):119–126, 1998

    Alan Agresti and Brent A Coull. Approximate is better than “exact” for interval estimation of binomial proportions.The American Statistician, 52(2):119–126, 1998

  30. [30]

    Duration of effusion after antibiotic treat- ment for acute otitis media: comparison of cefaclor and amoxicillin.The Pediatric Infectious Disease Journal, 1(5):310–316, 1982

    Ellen M Mandel, Charles D Bluestone, Howard E Rockette, Mark M Blatter, Keith S Reisinger, Frederick P Wucher, and James Harper. Duration of effusion after antibiotic treat- ment for acute otitis media: comparison of cefaclor and amoxicillin.The Pediatric Infectious Disease Journal, 1(5):310–316, 1982

  31. [31]

    Homogeneity tests and interval estimations of risk differences for stratified bilateral and unilateral correlated data.Statistical Papers, 65(6):3499–3543, 2024

    Shuyi Liang, Kai-Tai Fang, Xin-Wei Huang, Yijing Xin, and Chang-Xing Ma. Homogeneity tests and interval estimations of risk differences for stratified bilateral and unilateral correlated data.Statistical Papers, 65(6):3499–3543, 2024. 23