What methodological approaches do criminologists use to estimate crime rates among undocumented populations?

Checked on January 30, 2026

Executive summary

Criminologists estimate crime rates among undocumented populations by combining imperfect crime records (the numerator) with modeled population estimates (the denominator), then applying classification algorithms and robustness checks to compensate for data gaps. This strategy produces more precise estimates in some studies, notably those using Texas data, but it remains constrained by undercounting, survey nonresponse, and enforcement-driven measurement error [1] [2] [3]. The resulting body of work generally finds that immigrants, including undocumented people, offend at rates equal to or lower than those of the U.S.-born, but that conclusion rests on methodological choices that researchers explicitly test with sensitivity analyses [2] [4] [5].

1. Data sources and the numerator problem: arrests, convictions, incarceration, and victimization surveys

Most empirical work starts with available criminal-justice records (arrests, convictions, and incarceration rolls) because these are where immigration status is sometimes recorded or can be linked. Each source measures something different and embeds its own biases: arrests reflect policing activity, convictions reflect prosecutorial decisions, and incarceration reflects sentencing and post-conviction processes. Victimization surveys such as the National Crime Victimization Survey (NCVS) aim to capture unreported crime but rarely record immigration status directly [3] [1] [6]. Researchers therefore acknowledge that focusing on arrests or incarceration can conflate law enforcement's emphasis on immigration with actual offending, a limitation discussed across the literature [3] [6].

2. Denominators: population estimates and the CMS/Pew approaches

Estimating a rate requires a denominator: the number of undocumented people living in a given place. Criminologists rely on modeled population series from organizations such as the Center for Migration Studies (CMS) and Pew, which produce annual state and national estimates that are used to calculate per-capita rates for undocumented populations [2] [6]. The CMS methodology described in peer-reviewed work uses multistage classification, population controls by national origin, and adjustments for emigration, undercount, removals, status changes, and mortality; it is reported to yield smaller sampling error than older approaches, and researchers treat these modeled denominators as a critical input for rate estimation [1].
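
As a minimal sketch of the per-capita calculation described above, the Python snippet below divides a count of recorded events by a modeled population and scales the result to 100,000 residents. The function name rate_per_100k and all figures are hypothetical and chosen only for illustration; they are not taken from CMS, Pew, or any cited study.

```python
def rate_per_100k(events: int, population: float) -> float:
    """Return events per 100,000 people in the given population."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so the integer arithmetic stays exact until the division.
    return events * 100_000 / population


# Hypothetical, made-up figures for illustration only; not from any study.
arrests = 1_200                  # numerator: recorded arrests in one year
modeled_population = 1_500_000   # denominator: modeled population estimate

rate = rate_per_100k(arrests, modeled_population)
print(f"{rate:.1f} arrests per 100,000")  # prints "80.0 arrests per 100,000"
```

Because the denominator is itself a modeled estimate, any error in it passes directly into the rate: holding the numerator fixed, a denominator that is too low inflates the rate and one that is too high deflates it.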

3. Classification methods: identifying undocumented status in datasets

When administrative records do not flag immigration status, scholars use classification procedures that combine survey responses, country-of-origin patterns, and population controls to classify individuals probabilistically as likely undocumented. A multistage process like the one CMS uses, with adjustments for underenumeration and length of residence, is standard practice in contemporary studies [1] [6]. In rare cases, such as the Texas Department of Public Safety data, immigration status is recorded directly for arrestees, producing a unique “clean” numerator that researchers treat as a gold-standard test of the broader methodology [2] [3].
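
The idea of probabilistic classification against population controls can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the CMS procedure itself: it assumes survey records have already been narrowed to candidates, that each record carries a survey weight and an origin group, and that an independent control total exists for each origin. The names assign_probabilities, adjust_for_undercount, and all data values are hypothetical.

```python
from collections import defaultdict


def assign_probabilities(candidates, controls):
    """Attach a probability of undocumented status to each candidate record.

    Within each origin group, the probability is the population control
    divided by the weighted number of candidates, capped at 1.
    """
    weighted_totals = defaultdict(float)
    for rec in candidates:
        weighted_totals[rec["origin"]] += rec["weight"]

    classified = []
    for rec in candidates:
        total = weighted_totals[rec["origin"]]
        control = controls.get(rec["origin"], 0.0)
        p = min(1.0, control / total) if total > 0 else 0.0
        classified.append({**rec, "p_undocumented": p})
    return classified


def adjust_for_undercount(count, undercount_rate):
    """Inflate a survey-based count for people the survey missed."""
    if not 0 <= undercount_rate < 1:
        raise ValueError("undercount_rate must be in [0, 1)")
    return count / (1 - undercount_rate)


# Hypothetical candidate records and control totals, for illustration only.
candidates = [
    {"origin": "A", "weight": 100.0},
    {"origin": "A", "weight": 300.0},
    {"origin": "B", "weight": 50.0},
]
controls = {"A": 200.0, "B": 80.0}

classified = assign_probabilities(candidates, controls)
expected_a = sum(
    r["weight"] * r["p_undocumented"] for r in classified if r["origin"] == "A"
)
print(expected_a)  # weighted expected count for origin A
print(adjust_for_undercount(expected_a, 0.2))  # assumed 20% undercount
```

In this sketch the expected count for an origin group matches its control whenever the control does not exceed the weighted number of candidates. Directly recorded status, as in the Texas data, removes the need for this step on the numerator side.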

4. Sensitivity analyses and robustness checks

The strongest studies explicitly test whether results hold under alternative denominators, alternative definitions of “undocumented” at arrest, the substitution of misdemeanors for felonies, and different administrative measures (arrest vs. conviction vs. incarceration). When their claims are strong, they report robustness across these permutations; for example, the finding of lower arrest rates for undocumented people in Texas is robust to multiple alternative specifications [7] [2] [3]. Researchers also test for nonresponse bias in victimization surveys and for heterogeneity in effects across states, using measures such as police staffing or drug-overdose mortality to control for contextual confounders [1].
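
One way to picture such a check is a grid that recomputes the rate, and its ratio to a comparison-group rate, under every pairing of numerator and denominator. The Python sketch below is a simplified illustration: the function sensitivity_grid, the measure names, and every number are hypothetical and do not reproduce any published specification or result.

```python
from itertools import product


def rate_per_100k(events, population):
    """Return events per 100,000 people in the given population."""
    return events * 100_000 / population


def sensitivity_grid(numerators, denominators, comparison_rates):
    """Compute the rate and rate ratio for every numerator/denominator pair."""
    rows = []
    for (measure, events), (source, population) in product(
        numerators.items(), denominators.items()
    ):
        rate = rate_per_100k(events, population)
        ratio = rate / comparison_rates[measure]
        rows.append((measure, source, rate, ratio))
    return rows


# Hypothetical inputs for illustration only; not from any study.
numerators = {"felony_arrests": 900, "felony_convictions": 400}
denominators = {"estimate_a": 1_500_000, "estimate_b": 1_800_000}
# Hypothetical comparison-group rates per 100,000 for the same measures.
comparison_rates = {"felony_arrests": 100.0, "felony_convictions": 40.0}

grid = sensitivity_grid(numerators, denominators, comparison_rates)
for measure, source, rate, ratio in grid:
    print(f"{measure:20s} {source:12s} rate={rate:6.1f} ratio={ratio:.2f}")

# The qualitative finding is "robust" here if every ratio stays below 1.
robust = all(ratio < 1 for *_, ratio in grid)
print("ratio below 1 in all specifications:", robust)
```

The design choice is to report the full grid, not a single preferred specification, so a reader can see whether the direction of the comparison depends on any one numerator or denominator.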

5. Interpretive constraints and competing findings

Methodological caveats pervade the field. Undercounting of recent arrivals, fear-driven survey nonresponse, and the entanglement of local policing with federal immigration enforcement can bias both numerator and denominator in unknown directions, so many authors stop short of claiming definitive causal proof and instead emphasize robustness and plausibility checks [6] [1]. A broad literature spanning national, state, and city-level studies finds that immigrants commit crimes at similar or lower rates than the U.S.-born, but reviews note nuanced findings and international exceptions: some analyses detect small associations between undocumented status and particular offense types in certain settings [4] [8].

6. What this means for policy and future research

The methodological toolkit combines modeled denominators (CMS/Pew), administrative linkage where available (Texas DPS), probabilistic classification, multi-measure cross-checks, and sensitivity testing, and it permits credible estimation despite imperfect data. To move from robust descriptive patterns to firmer causal claims, the field needs broader, standardized recording of immigration status in crime datasets, careful attention to enforcement-driven measurement error, and more replication across jurisdictions [2] [3] [1].
