Post-selection inference for network structure
Pith reviewed 2026-07-02 00:51 UTC · model grok-4.3
The pith
Two new confidence intervals remain valid for network group densities even after selecting groups from the data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We develop two new confidence intervals that are universally valid post-selection in the sense that they guarantee simultaneous coverage asymptotically over all pairs of groups whose relative sizes do not vanish. Our first interval builds on a strategy of Berk et al. Our second interval is based on a Talagrand-type concentration inequality for empirical processes. Only the second interval achieves the best-possible width asymptotically up to a constant factor.
What carries the argument
The second confidence interval derived from a Talagrand-type concentration inequality for empirical processes, which delivers simultaneous coverage over selected group pairs and near-optimal width.
If this is right
- Both intervals are simple to compute and scale to large networks.
- The second interval achieves the best possible width asymptotically up to a constant factor.
- Some evidence for homophily in social networks and hub-and-spoke trade structures survives the correction.
- Evidence for disjoint market segments in worker transition networks does not survive the correction.
- The intervals guarantee simultaneous coverage over all qualifying pairs of groups.
Where Pith is reading between the lines
- Researchers studying community detection in networks may need to reexamine earlier density estimates that ignored selection.
- The concentration approach could extend to other network statistics such as betweenness or modularity if similar empirical process bounds hold.
- In labor economics, claims about segmented markets might become more conservative once post-selection adjustments are routine.
- Finite-sample performance could be checked via Monte Carlo experiments on networks of moderate size where group fractions are held fixed.
Load-bearing premise
The groups are selected from the observed network data in an asymptotic regime where their relative sizes do not vanish.
What would settle it
A simulation on a sequence of growing networks with known true densities where the intervals fail to cover at the nominal rate for fixed-fraction groups would disprove the asymptotic simultaneous coverage.
Figures
read the original abstract
Researchers often use the density of connections between groups of agents, such as communities, blocs, or markets, to characterize the structure of a social or economic network. In many cases, these groups are selected using the network data, making conventional fixed-group inference procedures potentially invalid. To address this issue, we develop two new confidence intervals that are universally valid post-selection in the sense that they guarantee simultaneous coverage asymptotically over all pairs of groups whose relative sizes do not vanish. Our first interval builds on a strategy of \cite{berk2013valid}. Our second interval is based on a Talagrand-type concentration inequality for empirical processes. Both intervals are simple to compute and scalable to large networks, but a key technical contribution of our paper is show that only the second interval achieves the best-possible width asymptotically up to a constant factor. Three empirical illustrations show that accounting for selection can matter in practice. Some evidence for homophily in a social network and a hub-and-spoke structure in a trade network survives our correction, while evidence for disjoint market segments in a worker transition network does not.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops two new confidence intervals for inference on network densities between groups that are selected from the observed network data itself. The intervals are constructed to deliver asymptotic simultaneous coverage over all pairs of groups whose relative sizes do not vanish. The first interval adapts the Berk et al. (2013) approach; the second uses a Talagrand-type concentration inequality for empirical processes. The paper establishes that only the second interval attains the best-possible asymptotic width up to a constant factor. Three empirical applications (social network, trade network, worker transition network) illustrate that selection correction can alter substantive conclusions about homophily, hub-and-spoke structure, and market segmentation.
Significance. If the coverage and optimality claims hold, the paper supplies a practical, scalable tool for a pervasive problem in empirical network analysis: valid inference after data-driven group selection. The explicit optimality result for the Talagrand-based interval (up to a constant factor) is a clear technical contribution that distinguishes the two proposals. The empirical illustrations provide concrete evidence that the correction matters in practice. These features would make the work a useful reference for applied researchers working with community detection, bloc identification, or market segmentation in networks.
minor comments (3)
- Abstract: the phrase 'three empirical illustrations' would benefit from naming the three networks (social, trade, worker transition) to give readers immediate context without needing to reach the main text.
- The manuscript relies on the external result of Berk et al. (2013) for the first interval and on Talagrand-type bounds for the second; a short self-contained sketch of how these are applied to the network-density functional would improve accessibility even if the full proofs remain in an appendix.
- Notation: ensure that the definition of 'relative size' (used in the non-vanishing condition) is stated once in a single display equation or definition box so that readers can locate it without searching multiple sections.
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity
full rationale
The derivation builds new post-selection confidence intervals by adapting the strategy from the external citation Berk et al. (2013) for the first interval and applying a Talagrand-type concentration inequality for the second. The key technical result—that only the second achieves asymptotically optimal width up to a constant factor—is established via the external inequality and asymptotic analysis under the stated regime (relative group sizes not vanishing). No step reduces a claimed prediction or result to a fitted parameter or self-defined quantity by construction, and the cited results are independent external references rather than self-citations. The paper is self-contained against its external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Asymptotic regime in which selected group relative sizes do not vanish
Reference graph
Works this paper leans on
-
[1]
The following Lemma 4 relates || · ||† to || · ||∞→1 and || · ||♠
≤1.7823. The following Lemma 4 relates || · ||† to || · ||∞→1 and || · ||♠. The upper bound in (A.3) is Theorem 3 of Gittens and Tropp (2009). The lower bound in (A.3) builds on an argument given in the third paragraph of Section 4.2 of Rudelson and Vershynin (2007). In this result, our contribution is to replace their use of Khintchine’s inequality (whic...
2009
-
[2]
Proof.The inequalityE[||Z|| ∞→1]≤2E[||Z|| †] is Theorem 3 of Gittens and Tropp (2009)
≤1.7823. Proof.The inequalityE[||Z|| ∞→1]≤2E[||Z|| †] is Theorem 3 of Gittens and Tropp (2009). To demonstrate inequality (2 KG(∞))−1E[||Z|| †]≤E[||Z|| ∞→1], we actually show the stronger result that for any N1 ×N 2 matrix X, ||X|| † ≤ 2KG(∞)||X|| ∞→1. To show the stronger result, we first show that ||X||1,2 ≤K G(∞)||X|| ∞→1. Lemma 3 above implies that P ...
2009
-
[3]
≤1.7823. Proof. Fix x > 0 and c∈ (0, 1/2]. We demonstrate the result in three steps. In the first step, we represent ||ϵ||□;c/(2B) as the maximum of an empirical process indexed by elements of Gc and apply Lemma 8 above to bound ||ϵ||□;c/(2B). The bounds depend on the expected value of the maximum of the empirical process. We bound this expectation in the...
2005
-
[4]
∈argmax (g1,g2)∈Gc P i∈[N1],j∈[N2] ϵijgi,1gj,2 , using an arbitrary deterministic tie-breaking rule, and letm ∗ t =P i∈[Nt] gi,t fort∈ {1,2}. We start with the lower bound ˆθ(g∗ 1, g∗ 2)−θ(g ∗ 1, g∗ 2) = 1 m∗ 1m∗ 2 X i∈[N1],j∈[N2] ϵijg∗ i,1g∗ j,2 ≥ 1 2√m∗ 1m∗ 2 p N1 min 1 N1 X i∈[N1] (ui)+, 1 N1 X i∈[N1] (ui)− + p N2 min 1 N2 X j∈[N2] (vj)+...
-
[5]
> 0 with probability approaching one, since ˆσ(g∗ 1, g∗ 2)/σ0 →p 1 and σ0 > 0. Combin- ing these facts with (A.22) and noting that 2 √ 2π = √ 8π, it follows that, with probability 74 approaching one along any asymptotic sequence of models inF 0 withN 1, N2 → ∞, ˆθ(g∗ 1, g∗ 2)−θ(g ∗ 1, g∗ 2) ≥(1−s/2) √N1 + √N2√ 8π ˆσ(g∗ 1, g∗ 2)√m∗ 1m∗ 2 >(1−s) √N1 + √N2√ ...
-
[6]
Since I(g∗ 1, g∗ 2; α) = ˆθ(g∗ 1, g∗
> 0 and the final inequality is (A.21). Since I(g∗ 1, g∗ 2; α) = ˆθ(g∗ 1, g∗
-
[7]
±[K׈σ(g ∗ 1, g∗ 2)]/ √m∗ 1m∗ 2, the event {θ(g∗ 1, g∗
-
[8]
Finally, since (g∗ 1, g∗
∈I (g∗ 1, g∗ 2; α)} implies |ˆθ(g∗ 1, g∗ 2)−θ(g ∗ 1, g∗ 2)| ≤Kˆσ(g∗ 1, g∗ 2)/√m∗ 1m∗ 2, and therefore lim F∈F 0:N 1,N2→∞ P(θ(g ∗ 1, g∗ 2)∈I(g ∗ 1, g∗ 2;α)) = 0. Finally, since (g∗ 1, g∗
-
[9]
∈ G c implies ∩(g1,g2)∈Gc{θ(g1, g2) ∈I (g1, g2; α)} ⊆ {θ (g∗ 1, g∗
-
[10]
Step 1:In this step we show the inequality (A.22)
∈I (g∗ 1, g∗ 2; α)}, and sinceF 0 ⊆ F, lim inf F∈F:N 1,N2→∞ P ∩(g1,g2)∈Gc {θ(g1, g2)∈I(g 1, g2;α)} = 0, which is the conclusion of the proposition. Step 1:In this step we show the inequality (A.22). By definition of ( g∗ 1, g∗ 2), we have that P i∈[N1],j∈[N2] ϵijg∗ i,1g∗ j,2 ≥ P i∈[N1],j∈[N2] ϵijg′ i,1g′ j,2 for any (g′ 1, g′
-
[11]
Choosing g′ j,2 = 1 for all j∈ [N2] and g′ i,1 = 1{ P j∈[N2] ϵij ≥ 0}1{ P i∈[N1] 1{ P j∈[N2] ϵij ≥ 0} ≥ N1/2} + 1{ P j∈[N2] ϵij ≤ 0}1{ P i∈[N1] 1{ P j∈[N2] ϵij ≥ 0}< N 1/2}
∈ G c. Choosing g′ j,2 = 1 for all j∈ [N2] and g′ i,1 = 1{ P j∈[N2] ϵij ≥ 0}1{ P i∈[N1] 1{ P j∈[N2] ϵij ≥ 0} ≥ N1/2} + 1{ P j∈[N2] ϵij ≤ 0}1{ P i∈[N1] 1{ P j∈[N2] ϵij ≥ 0}< N 1/2}. By construction, the entries of g′ 1 and g′ 2 take values in {0, 1},P j∈[N2] g′ j,2 = N2 andP i∈[N1] g′ i,1 ≥N 1/2, and so 75 (g′ 1, g′ 2)∈ G c for anyc∈[0,1/2]. Under this cho...
-
[12]
1− ˆσ(g∗ 1 ,g∗ 2 )−σ0 σ0 1+ ˆσ(g∗ 1 ,g∗ 2 )−σ0 σ0 # , C := E ui σ0 +
we have that X i∈[N1],j∈[N2] ϵijg′ i,1g′ j,2 = X i∈[N1] X j∈[N2] ϵij + 1{ X i∈[N1] 1{ X j∈[N2] ϵij ≥0} ≥N 1/2} + X i∈[N1] X j∈[N2] ϵij − 1{ X i∈[N1] 1{ X j∈[N2] ϵij ≥0}< N 1/2} ≥min X i∈[N1] X j∈[N2] ϵij + , X i∈[N1] X j∈[N2] ϵij − . By the same logic, choosingg′ i,1 = 1 for all i∈ [N1] and g′ j,2 = 1{ P i∈[N1] ϵij ≥ 0}1...
-
[13]
Since σ(g∗ 1, g∗
≤ 2 and so P 1 N1 X i∈[N1] ui σ0 + −E ui σ0 + " 1− ˆσ(g∗ 1 ,g∗ 2)−σ0 σ0 1 + ˆσ(g∗ 1 ,g∗ 2)−σ0 σ0 # > t/3 ≤P 1 N1 X i∈[N1] ui σ0 + −E ui σ0 + > t/6 +P(E C N). Since σ(g∗ 1, g∗
-
[14]
Since E[(u i)+]≥E[u i] = 0 and E (ui)2 + ≤E[(u i)2] = σ2 0, it follows that Var((ui)+) ≤σ 2 0, Var ui σ0 + ≤ 1, and so Var 1 N1 P i∈[N1] ui σ0 + ≤ 1 N1
= σ0 under F0 and Assumption 3(i) holds uniformly over Gc, we have P (EC N) → 0. Since E[(u i)+]≥E[u i] = 0 and E (ui)2 + ≤E[(u i)2] = σ2 0, it follows that Var((ui)+) ≤σ 2 0, Var ui σ0 + ≤ 1, and so Var 1 N1 P i∈[N1] ui σ0 + ≤ 1 N1 . It follows from Chebyshev’s inequality that for anyt >0, P 1 N1 X i∈[N1] ui σ0 + −E ui σ0 + ≥t/6 ≤ 36 t2N1 which c...
2009
-
[15]
Because Assumption 3(i) holds uniformly overG c, it follows that ∆N = ˆσ(g∗ 1, g∗ 2)−σ(g ∗ 1, g∗ 2) σ(g ∗ 1, g∗
:= q 1 m∗ 1m∗ 2 P i∈[N1],j∈[N2] σ2 ijg∗ i,1g∗ j,2 = σ0. Because Assumption 3(i) holds uniformly overG c, it follows that ∆N = ˆσ(g∗ 1, g∗ 2)−σ(g ∗ 1, g∗ 2) σ(g ∗ 1, g∗
-
[16]
Now define the event EN := |∆N | ≤ 1 2
=o p(1). Now define the event EN := |∆N | ≤ 1 2 . Since ∆ N = op(1), we have P (EC N) → 0. On the eventE N, ∆N 1+∆N ≤2|∆ N |, and so P E ui σ0 + ∆N 1 + ∆N > t/3 ≤P 2E ui σ0 + |∆N |> t/3 +P(E C N). Since E ui σ0 + is eventually bounded and ∆ N = op(1), the first term on the right-hand side converges to zero. The second term also converges to zero. This dem...
-
[17]
≤ 1.7823 denote Krivine’s upper bound for KG(∞) (Lemma 3). Since KG(∞) ≤ ¯KG, the lower-tail inequality (A.12) of Lemma 10 re- mains valid with KG(∞) replaced by ¯KG, because (18 ¯KG)−1E[||ϵ|| †]≤ (18KG(∞))−1E[||ϵ|| †] so the event on the left-hand side of (A.12) only shrinks; likewise the inequality ||X||1,2 ≤ KG(∞)||X|| ∞→1 from the proof of Lemma 4 hol...
-
[18]
By construction, |ˆθ(g∗ 1, g∗ 2)−θ(g∗ 1, g∗ 2)| = ||ϵ||□;c/(m∗ 1m∗ 2)
∈argmax (g1,g2)∈Gc P i∈[N1],j∈[N2] ϵijgi,1gj,2 , using an arbitrary deterministic tie- breaking rule, and let m∗ t :=P i∈[Nt] g∗ i,t for t∈ { 1, 2}. By construction, |ˆθ(g∗ 1, g∗ 2)−θ(g∗ 1, g∗ 2)| = ||ϵ||□;c/(m∗ 1m∗ 2). Since (g ∗ 1, g∗ 2)∈ G c, P ∩(g1,g2)∈Gc {θ(g1, g2)∈I(g 1, g2;α)} ≤P(θ(g ∗ 1, g∗ 2)∈I(g ∗ 1, g∗ 2;α)).(A.25) On the event {θ(g∗ 1, g∗
-
[19]
∈I (g∗ 1, g∗ 2; α)}, both θ(g∗ 1, g∗
-
[20]
belong to I(g∗ 1, g∗ 2; α) — the latter by hypothesis — and therefore |ˆθ(g∗ 1, g∗
-
[21]
−θ(g∗ 1, g∗ 2)| ≤ |I (g∗ 1, g∗ 2; α)|. For all (N1, N2) with min(N1, N2) large enough that the width hypothesis (4.3) is in force, it follows that, almost surely, {θ(g∗ 1, g∗ 2)∈I(g ∗ 1, g∗ 2;α)} ⊆ |ˆθ(g∗ 1, g∗ 2)−θ(g ∗ 1, g∗ 2)| ≤(1−δ ′)c ∗(α) ˆ¯τ m∗ 1m∗ 2 . (If ˆ¯τ= 0, the convention in (4.3) forces |I(g∗ 1, g∗ 2; α)| = 0, in which case |ˆθ−θ| (g∗ 1, g∗
-
[22]
= 0 on the displayed event and the containment holds trivially.) Hence P(θ(g ∗ 1, g∗ 2)∈I(g ∗ 1, g∗ 2;α))≤P |ˆθ(g∗ 1, g∗ 2)−θ(g ∗ 1, g∗ 2)| ≤(1−δ ′)c ∗(α) ˆ¯τ m∗ 1m∗ 2 .(A.26) Let r be the sequence of real numbers in Assumption 3(ii) and set r′ := r¯τ. Decomposing the right-hand side of (A.26) according to whether ˆ¯τ≤¯τ+r ′ gives P |ˆθ−θ|(g ∗ 1, g∗ 2)≤(1...
-
[23]
residual
−θ (g∗ 1, g∗ 2)| = ||ϵ||□;c/(m∗ 1m∗ 2), the first inequality is becauseF∈ F β,δ implies||σ|| F ≤C −1 β,δ E[||ϵ|| †], the second equality rearranges terms, and the second inequality is the lower-tail inequality (A.12) of Lemma 10 (with ¯KG in place of KG(∞), and noting that 2V 2 = 2 (||σ||2 F + 4BE[||ϵ|| †] +B||σ|| F )) applied withx=x β,δ. 84 Since r→ 0, ...
2015
-
[24]
Define ˜Y:=Y /Bso that the entries of ˜Yare bounded by 1 in absolute value. 89
-
[25]
Let ˜Y= P i∈[Nmin] siuivT i be the singular value decomposition of ˜Y
-
[26]
DefineS:={i:s i ≥(2 +η) √Nmax}andW:= P i∈S siuivT i
-
[27]
Define ˆµij :=B(W ij1{|W ij| ≤1}+1{W ij >1} −1{W ij <−1})
-
[28]
weak” version that restricts the full matrices µ or σ, and a “strong
Define ˆ¯τ:= 1.01||Y−ˆµ||†+0.25||Y−ˆµ||F , ˆV := p ||Y−ˆµ|| 2 F +B||Y−ˆµ|| F + 4B||Y−ˆµ|| †, and ˆσ(g1, g2) := ||(Y−ˆµ )g1g2||F /√m1m2 for (g1, g2) ∈ G c where the matrix ˆµhas ˆµij as itsijth entry and the matrix (Y−ˆµ)g 1g2 has (Yij −ˆµij)gi,1gj,2 as itsijth entry. B.2.2 Assumptions We make four additional assumptions about the class of random graph mod...
2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.