What are the main methodological criticisms of high-end estimates like the Yale 22.1 million figure?
Executive summary
High-end estimates such as the Yale/MIT paper’s mean of 22.1 million undocumented immigrants hinge on a complex demographic model that produced a wide 95% probability range (16–29 million) across one million simulations. The main methodological criticisms focus on unreliable inputs, implausible historical flow assumptions, conflicts with administrative “footprints,” and the risk of overconfident communication of uncertain model outputs [1] [2] [3].
1. Model inputs are opaque and highly sensitive — small errors cascade into large differences
Critics argue the Yale team’s bottom-up modeling depends on numerous parameters (arrival rates, undercount adjustments, emigration, mortality, and naturalization rates) and that uncertainty in those inputs can produce dramatically different totals. The study itself acknowledged large uncertainty and presented a wide simulation range, but opponents counter that admitting a broad range does not neutralize the fact that the chosen priors and parameter distributions drive the result [1] [2].
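To make the sensitivity point concrete, the sketch below rolls a simple stock-flow projection forward, in the general spirit of an accounting model of this kind. Every number in it is a hypothetical placeholder rather than a parameter from the study; the takeaway is only that a one-percentage-point change in a single rate shifts the multi-decade total by about a million people.

```python
# Minimal sketch of a stock-flow projection in the spirit of an accounting
# model like the paper's. All parameter values are hypothetical placeholders,
# not the study's actual inputs.

def project_stock(initial_stock, annual_arrivals, emigration_rate,
                  mortality_rate, naturalization_rate, years):
    """Roll the undocumented population stock forward year by year."""
    stock = initial_stock
    for _ in range(years):
        stock += annual_arrivals                 # assumed new inflows
        stock -= stock * emigration_rate         # departures abroad
        stock -= stock * mortality_rate          # deaths
        stock -= stock * naturalization_rate     # status adjustments
    return stock

baseline = project_stock(3.5e6, 500_000, 0.04, 0.004, 0.01, years=28)
lower_emigration = project_stock(3.5e6, 500_000, 0.03, 0.004, 0.01, years=28)

print(f"baseline emigration (4%): {baseline / 1e6:.1f} million")
print(f"lower emigration (3%):    {lower_emigration / 1e6:.1f} million")
# One percentage point on a single rate moves the 28-year total by roughly
# a million people; stacking several such uncertain rates compounds the gap.
```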
2. The 1990s arrival assumptions look “completely implausible” to many demographers
A core dispute centers on the paper’s reconstruction of undocumented inflows in the 1990s: several experts, including analysts at the Migration Policy Institute and critics summarized in reporting, contend the study relies on implausibly large arrival numbers for that decade (a period central to inflating the cumulative total) and that the 22 million figure “just doesn’t fit” accepted demographic patterns of growth and change [4] [3].
3. “People leave footprints”: the paper conflicts with administrative and survey records
Leading demography centers and critics point out that births, school enrollments, death records, and tax and social-service data produce observable signals that are difficult to reconcile with a doubling of the consensus 11 million estimate. Skeptics say accepting 22 million requires believing that Census Bureau surveys and a wide swath of administrative data “missed huge numbers of people,” an assertion many experts call unlikely [5] [6].
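A hedged back-of-envelope version of the footprint argument: under an assumed crude birth rate (chosen purely for illustration, not measured for this population), doubling the population stock implies a large stream of additional births that would have to be missing from vital records.

```python
# Toy "footprint" consistency check. The crude birth rate is an assumed
# illustrative figure, not measured data for this population.

CRUDE_BIRTH_RATE = 0.020  # assumed ~20 births per 1,000 people per year

for stock in (11e6, 22e6):
    implied_births = stock * CRUDE_BIRTH_RATE
    print(f"{stock / 1e6:.0f} million people -> ~{implied_births:,.0f} births/year")

# Under these assumptions, the 22 million scenario implies ~220,000 more
# births each year than the consensus estimate; births are recorded nearly
# universally, so a gap that size should be visible in vital statistics.
```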
4. One million simulations are not a panacea — garbage-in, garbage-out remains the core critique
The authors ran one million Monte Carlo trials to propagate uncertainty, but critics warn that exhaustively simulating a model only multiplies the consequences of any faulty assumptions rather than validating them. Generating a million realizations of a poorly specified model does not prove the central estimates are correct; some commentators say the statistical basis was “woefully inadequate” and repeatedly produced the wrong answer [6] [1].
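A minimal sketch of that garbage-in, garbage-out point, using toy numbers that are not the paper’s: Monte Carlo repetition tightens the simulation’s own noise but cannot move the answer away from whatever the input assumptions imply.

```python
import random
import statistics

def toy_total(mean_annual_inflow):
    """One simulated cumulative total: ten years of noisy arrivals,
    scaled by an assumed net-of-outflows factor of 0.7."""
    arrivals = sum(random.gauss(mean_annual_inflow, 0.2e6) for _ in range(10))
    return 0.7 * arrivals

random.seed(0)
BIASED_INFLOW = 2.0e6  # suppose the true annual inflow were half this (toy)

for trials in (1_000, 100_000):
    totals = [toy_total(BIASED_INFLOW) for _ in range(trials)]
    print(f"{trials:>7,} trials -> median {statistics.median(totals) / 1e6:.1f} million")

# Both runs report ~14 million because that answer was baked into the biased
# inflow assumption; extra trials only shrink the Monte Carlo noise around it.
```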
5. Consensus institutions stuck with lower numbers, underscoring methodological divergence
Major institutions — including Pew Research Center and long‑standing demographic researchers — maintained estimates near 11–12 million and quickly rejected the 22 million headline as unsupported, highlighting a methodological split: where the Yale team used an operations‑modeling approach, institutional practice has relied on survey‑based and administrative‑consistency checks, and that divergence feeds skepticism about which methods best capture hidden populations [3] [5].
6. Communication, political incentives, and the risk of misuse
Observers also criticize how the study’s point estimate has been amplified in media and political discourse despite the stated range; scholars warn that the dramatic headline number can be seized by ideological actors on both sides — either to demand tougher enforcement or to argue for sweeping policy changes — and that the authors themselves cautioned the figure could be cited “uncritically by some news outlets” and used for ideological purposes [6] [1].
Conclusion: credible caution, not dismissal, is the methodological takeaway
The methodological criticisms converge on a single theme: high-end estimates like 22.1 million are driven less by incontrovertible new evidence than by alternative modeling choices and contentious assumptions about past flows and undercounts. While the Yale study demonstrates that plausible alternative accounting could yield much higher totals (and offers a wide uncertainty band), mainstream demographers and policy shops reject the specific magnitude, largely because it conflicts with other demographic signals and because exhaustive simulation cannot fix weak or implausible inputs [1] [5] [4].