citations

When Pith reviews a paper, every reference becomes its own page. Works cited by many reviewed papers bubble up as foundational candidates for the next Recognition Review. The graph below is the live result.

browse foundational works

The most-cited unreviewed papers in Pith's corpus. A live queue for future review.

submit for recognition review

Already reviewed by Pith? Request a deeper Recognition pass. Cite the result.

trace a paper's references

Open any reviewed paper in the home feed to see its bibliography graph.

foundational works · sample

Top-cited works in the Pith corpus. Reviewed items show their full Pith recap (verdict, novelty, lean and Recognition status) inline; pending items get auto-promoted to a full review on the next sweep.
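As a rough illustration of how a top-cited list like this could be derived, the sketch below counts, for each cited work, how many distinct reviewed papers cite it. The function name `top_cited` and the input shape (a dict from citing-paper id to its list of cited-work ids) are assumptions made for the example, not Pith's actual pipeline or schema.

```python
from collections import Counter


def top_cited(bibliographies, top_n=10):
    """Rank cited works by the number of distinct papers that cite them.

    bibliographies: hypothetical mapping of citing-paper id -> iterable of
    cited-work ids (for example arXiv identifiers).
    """
    counts = Counter()
    for cited_ids in bibliographies.values():
        # A work listed twice in one bibliography still counts once.
        counts.update(set(cited_ids))
    # Highest count first; ties broken by id so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    sample = {
        "paper-a": ["arXiv:1412.6980", "arXiv:1707.06347"],
        "paper-b": ["arXiv:1412.6980"],
    }
    for work_id, n_citing in top_cited(sample):
        print(work_id, n_citing)
```

Counting distinct citing papers rather than raw mentions keeps one long bibliography from inflating a work's rank.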

Adam: A Method for Stochastic Optimization
2073 Pith citing papers · 2014 · arXiv:1412.6980 · accept · novelty 7.5 · 4 RS theorems

A first-order stochastic optimizer that maintains bias-corrected exponential moving averages of the gradient and its square, dividing the former by the square root of the latter to set per-parameter step sizes.

Proximal Policy Optimization Algorithms
1993 Pith citing papers · 2017 · arXiv:1707.06347 · accept · novelty 7.0 · 3 RS theorems

A clipped surrogate objective L^CLIP = E[min(r_t A_t, clip(r_t, 1-ε, 1+ε) A_t)] enables multi-epoch minibatch policy updates with TRPO-like stability but first-order optimization.

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
1924 Pith citing papers · 2024 · arXiv:2402.03300 · unverdicted · novelty 6.0

DeepSeekMath 7B reaches 51.7% on MATH via continued pretraining on curated web math data and Group Relative Policy Optimization.

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
1897 Pith citing papers · 2025 · arXiv:2501.12948 · unverdicted · novelty 6.0

Pure reinforcement learning on LLMs produces emergent reasoning patterns and outperforms supervised models trained on human demonstrations on verifiable math, coding, and STEM tasks.

Training Verifiers to Solve Math Word Problems
1465 Pith citing papers · 2021 · arXiv:2110.14168 · conditional · novelty 6.0

Introduces the GSM8K dataset and demonstrates that verifier-based selection of solutions from multiple candidates outperforms fine-tuning baselines on math word problems.

Evaluating Large Language Models Trained on Code
1308 Pith citing papers · 2021 · arXiv:2107.03374 · accept · novelty 6.0 · 2 RS theorems

Codex solves 28.8% of HumanEval problems with a single sample (pass@1), rising to 70.2% when 100 samples are drawn per problem.

LLaMA: Open and Efficient Foundation Language Models
1301 Pith citing papers · 2023 · arXiv:2302.13971 · conditional · novelty 7.0

LLaMA models (7B to 65B parameters) are trained only on public data; LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with top closed models such as Chinchilla-70B and PaLM-540B.

Decoupled Weight Decay Regularization
1222 Pith citing papers · 2017 · arXiv:1711.05101 · review pending

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
1114 Pith citing papers · 2020 · arXiv:2010.11929 · accept · novelty 9.0

Vision Transformer (ViT) applies a standard transformer directly to image patches and matches or exceeds state-of-the-art CNN performance on classification benchmarks after large-scale pre-training.

Llama 2: Open Foundation and Fine-Tuned Chat Models
1089 Pith citing papers · 2023 · arXiv:2307.09288 · unverdicted · novelty 6.0

Llama 2 introduces pretrained and chat-optimized LLMs up to 70B parameters that surpass other open chat models on standard benchmarks and human evaluations of helpfulness and safety.
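The Adam recap in the sample above describes the update in words; a minimal pure-Python sketch of one step may make it concrete. The function name `adam_step`, the list-based state, and the default hyperparameters (the commonly cited values, which the recap does not state) are assumptions for illustration only.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a flat list of scalar parameters.

    m and v hold the running averages of the gradient and its square;
    t is the 1-based step count used for bias correction.
    Hyperparameter defaults are assumed, not taken from the recap.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and squared gradient.
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias correction compensates for the zero initialisation.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Per-parameter step: first moment over root of second moment.
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


if __name__ == "__main__":
    params, m, v = [1.0, -2.0], [0.0, 0.0], [0.0, 0.0]
    params, m, v = adam_step(params, grads=[2.0, -0.5], m=m, v=v, t=1)
    print(params)
```

On the first step the bias-corrected ratio has magnitude close to one, so each parameter moves by roughly `lr` in the direction opposite its gradient, regardless of the gradient's scale.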

browse all foundational works ›

Pith currently tracks 3,148,009 canonical cited works across 111,235 reviewed papers. 2983 have full external metadata; the rest are still resolving. 37,448 verdicts are linked to a Recognition theorem. See /sitemap for full coverage.