What methods do demographers use to estimate the unauthorized immigrant population and how do those estimates differ?

Checked on January 24, 2026

Executive summary

Demographers estimate the U.S. unauthorized immigrant population using several distinct approaches, most prominently the residual estimation method, survey-based imputation techniques, administrative-data linkage (the SSN nonmatch method), and bottom-up entry–exit models built from microdata. These approaches can produce different point estimates and trends because they rely on different data, assumptions and definitions [1] [2] [3] [4]. The methodological differences explain why prominent organizations (Pew, MPI, DHS/OHSS, academic teams, Fed researchers) report divergent totals and growth patterns: differences in source data, the treatment of “liminal” statuses, assumptions about departures and deaths, and the timing of inputs drive most of the gap [5] [6] [7] [4].

1. Residual estimation: subtraction by design

The residual method, long the field’s workhorse, starts with a survey-based count of the total foreign-born population (from the ACS or CPS) and subtracts a separately constructed demographic estimate of legally resident immigrants (built from admission records adjusted for deaths and departures); the remainder is interpreted as the unauthorized population [1] [2] [8]. The approach is transparent and builds on Census surveys and DHS admission records, which is why Pew and others still use it. It is sensitive, however, to errors in the underlying surveys, to assumptions about emigration and mortality, and to which temporary or humanitarian statuses are classified as “unauthorized” [2] [8] [6].
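
The subtraction can be sketched in a few lines of Python. This is a minimal illustration, not any organization's actual procedure: the function name, the single-rate treatment of mortality and emigration, and every input number are hypothetical and chosen only to show how the arithmetic responds to one assumption.

```python
def residual_estimate(foreign_born_total, legal_admissions,
                      mortality_rate, emigration_rate):
    """Return the residual: total foreign-born minus estimated legal residents."""
    # Legal residents still present = admissions that survived and did not leave.
    # (Assumption: one flat rate each for deaths and departures.)
    legal_resident = legal_admissions * (1 - mortality_rate) * (1 - emigration_rate)
    return foreign_born_total - legal_resident


# Hypothetical inputs, chosen only to show the arithmetic.
FOREIGN_BORN_TOTAL = 45_000_000   # survey-based count (ACS- or CPS-style)
LEGAL_ADMISSIONS = 40_000_000     # cumulative lawful admissions on record

low_emigration = residual_estimate(FOREIGN_BORN_TOTAL, LEGAL_ADMISSIONS, 0.10, 0.10)
high_emigration = residual_estimate(FOREIGN_BORN_TOTAL, LEGAL_ADMISSIONS, 0.10, 0.15)

print(f"Residual with 10% emigration: {low_emigration:,.0f}")
print(f"Residual with 15% emigration: {high_emigration:,.0f}")
```

With these made-up inputs, raising the assumed emigration rate among legal residents from 10% to 15% moves the residual from about 12.6 million to about 14.4 million, which is the kind of sensitivity the method's critics point to.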

2. Survey imputation and logical/statistical classification

To describe the characteristics of that residual population, researchers apply imputation techniques to survey microdata: logical imputation excludes people with clear markers of legal status, while statistical imputation probabilistically assigns unauthorized status based on patterns in the data [9]. These methods let analysts estimate education, employment and household composition from ACS or CPS respondents, but they inherit survey nonresponse and misreporting biases and require model choices that can shift subgroup totals. Teams also differ in how they treat asylum applicants, DACA recipients, TPS holders and parolees, which changes the counts [9] [6].
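
A rough sketch of the two-step logic follows. The legal markers used here (`naturalized`, `gov_employee`), the flat assignment probability, and the four records are all hypothetical; published implementations use richer rules and model-based probabilities.

```python
import random


def logically_legal(person):
    # Hypothetical "clear legal markers"; real rule sets are longer.
    return person["naturalized"] or person["gov_employee"]


def impute_unauthorized(people, p_unauthorized, seed=0):
    """Logical imputation first, then a probabilistic draw for everyone else."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    result = []
    for person in people:
        if logically_legal(person):
            status = False
        else:
            status = rng.random() < p_unauthorized
        result.append({**person, "unauthorized": status})
    return result


def weighted_total(people):
    # Survey weights scale each respondent up to the population they represent.
    return sum(p["weight"] for p in people if p["unauthorized"])


# Hypothetical foreign-born survey respondents.
people = [
    {"id": 1, "weight": 120.0, "naturalized": True,  "gov_employee": False},
    {"id": 2, "weight": 95.0,  "naturalized": False, "gov_employee": True},
    {"id": 3, "weight": 110.0, "naturalized": False, "gov_employee": False},
    {"id": 4, "weight": 80.0,  "naturalized": False, "gov_employee": False},
]

imputed = impute_unauthorized(people, p_unauthorized=0.5, seed=42)
print(weighted_total(imputed))
```

Respondents 1 and 2 can never be flagged, whatever the probability; only the draw for respondents 3 and 4 varies, which is where model choices enter.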

3. Administrative-linkage and SSN nonmatch innovations

A newer analytic path links household survey responses to Social Security Administration administrative records and flags nonmatching or missing SSNs as signals of likely unauthorized status, then applies adjustments to refine counts (the SSN nonmatch method) [3]. Authors of this approach report estimates consistent with residual-method totals after logical adjustments, but the method’s strength—direct use of administrative records—also means it depends on the quality of linkage, the interpretation of nonmatches (fraudulent, absent, or mismatched SSNs), and access to restricted data [3].
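
The flag-then-adjust sequence can be illustrated with a toy example. Everything here is an assumption made for illustration: the field names (`reported_id`, `legal_marker`), the placeholder identifiers, and the idea that linkage reduces to a simple set lookup. Real linkage runs on restricted data with probabilistic matching.

```python
def flag_ssn_nonmatch(survey_records, admin_ids):
    """Flag foreign-born respondents whose identifier is missing or unmatched."""
    flagged = []
    for rec in survey_records:
        if not rec["foreign_born"]:
            continue
        reported = rec.get("reported_id")
        if reported is None or reported not in admin_ids:
            flagged.append(rec)
    return flagged


def apply_logical_adjustment(flagged):
    # Drop flagged respondents who nonetheless show a (hypothetical) legal marker.
    return [r for r in flagged if not r.get("legal_marker", False)]


# Placeholder identifiers standing in for administrative records.
admin_ids = {"A1", "A2", "A3"}

survey_records = [
    {"foreign_born": True,  "reported_id": "A1"},                        # matches
    {"foreign_born": True,  "reported_id": None},                        # missing
    {"foreign_born": True,  "reported_id": "Z9"},                        # nonmatch
    {"foreign_born": True,  "reported_id": None, "legal_marker": True},  # adjusted out
    {"foreign_born": False, "reported_id": None},                        # not in scope
]

flagged = flag_ssn_nonmatch(survey_records, admin_ids)
likely_unauthorized = apply_logical_adjustment(flagged)
print(len(flagged), len(likely_unauthorized))
```

The gap between the two counts shows why the interpretation of a nonmatch matters: a missing identifier is a signal, not proof, of unauthorized status.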

4. Bottom‑up microdata entry–exit models

An alternative “bottom‑up” family of methods constructs flows of entries and exits from microdata—using border and parole records, notice‑to‑appear datasets, and other administrative sources—to build monthly estimates of unauthorized population stock by age and work status (the Dallas Fed and CBO‑style approaches) [4]. These models can produce higher-frequency and local estimates and explicitly model net inflows, but they require assumptions about duration of stay, unobserved internal settlement, and uncounted exits, making them sensitive to changes in enforcement and parole practices [4].
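
At its core, a bottom-up model is a stock-flow recursion: each month's stock equals the previous stock plus entries minus exits. The sketch below assumes a constant monthly exit rate applied to the existing stock and uses invented entry figures; it is not the Dallas Fed's or CBO's specification.

```python
def project_stock(initial_stock, monthly_entries, monthly_exit_rate):
    """Roll a population stock forward one month at a time."""
    stocks = []
    stock = initial_stock
    for entries in monthly_entries:
        # Assumption: exits (departures, deaths, status changes) are a
        # fixed share of the stock at the start of the month.
        exits = stock * monthly_exit_rate
        stock = stock + entries - exits
        stocks.append(stock)
    return stocks


# Hypothetical inputs: starting stock and six months of estimated entries.
entries = [100_000, 120_000, 150_000, 90_000, 60_000, 40_000]
path = project_stock(11_000_000, entries, monthly_exit_rate=0.005)

for month, stock in enumerate(path, start=1):
    print(f"month {month}: {stock:,.0f}")
```

Because the exit rate is applied to the whole stock every month, small changes in that one parameter compound over time, which is why assumptions about duration of stay and uncounted exits carry so much weight in these models.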

5. Why estimates differ: data, definition, and timing

Differences in published estimates—Pew’s residual-based 14.0 million in 2023, MPI’s profiles that include liminal statuses, DHS historical tables, and other provisional CPS‑based estimates—stem from three levers: which surveys or administrative sources are used, whether temporary humanitarian statuses (parole, TPS, asylum applicants, DACA) are counted as unauthorized, and how departures/deaths are modeled and updated over time [5] [6] [7] [10]. Methodological choices also reflect institutional agendas or priorities—advocacy or policy groups may include or exclude liminal populations to highlight different policy impacts—so transparency about definitions matters [6] [10].
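
The definitional lever is simple arithmetic, as the following sketch shows. The category labels and all counts are hypothetical placeholders, not figures from any of the organizations named above.

```python
# Statuses that some teams count as unauthorized and others do not.
LIMINAL_STATUSES = {"parole", "tps", "asylum_applicant", "daca"}


def total_unauthorized(counts_by_status, include_liminal):
    """Sum the population, optionally leaving out liminal statuses."""
    return sum(
        n for status, n in counts_by_status.items()
        if include_liminal or status not in LIMINAL_STATUSES
    )


# Hypothetical counts by status, for illustration only.
counts_by_status = {
    "no_status": 9_000_000,
    "parole": 700_000,
    "tps": 800_000,
    "asylum_applicant": 1_500_000,
    "daca": 500_000,
}

broad = total_unauthorized(counts_by_status, include_liminal=True)
narrow = total_unauthorized(counts_by_status, include_liminal=False)
print(f"broad definition:  {broad:,}")
print(f"narrow definition: {narrow:,}")
```

With these placeholder counts the same underlying data yield 12.5 million under the broad definition and 9 million under the narrow one, so two reports can differ by millions without disagreeing about a single person.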

6. Practical implications and limits of the evidence

For policymakers and the public, the lesson is pragmatic: treat any single estimate as a method-dependent signal rather than a definitive count. Match the method to the question: residuals (with ACS/CPS imputation) for broad stocks and characteristics, SSN linkage for validation and administrative insight, and bottom-up flow models for near-real-time local dynamics. In every case, report how sensitive the result is to alternative assumptions [1] [3] [4]. The evidence has limits: the available sources document the dominant techniques and their tradeoffs, but none can establish a single “true” number independent of the definitional and data choices described above [1] [9].
