How do ICE arrest and detention datasets differ in methodology and scope (arrests vs. detained population vs. removals)?

Checked on January 14, 2026

Executive summary

ICE’s public arrest, detention, and removal tables are distinct datasets that capture different moments in the enforcement pipeline and follow different counting rules. Arrest tables record encounters or administrative apprehensions and can count events rather than unique people. Detention tables log book‑ins and book‑outs for a highly transient population that moves between facilities, and they have documented gaps and exclusions. Removal tables record completed deportation events rather than upstream encounters or custody spells [1] [2] [3]. Independent researchers and auditors warn that these silos, inconsistent definitions, and ICE’s chosen exclusions mean simple comparisons across the datasets can mislead unless methodology and linkage are carefully accounted for [4] [5] [3].

1. Arrests: event-based logs, multiple encounters per person

ICE arrest datasets primarily log administrative arrests or “encounters,” along with the date, arrest method, and place. The agency’s monthly tables explicitly note that people can be counted more than once during a reporting period because the unit is an event, not necessarily a unique individual [1] [2]. The Deportation Data Project and related documentation show that arrest tables contain identifiers that can, in theory, allow linkage to detentions and removals. However, the arrest data often omit arrests made by other enforcement components (such as HSI) and can be inconsistent about location granularity, which complicates attribution of where and how an arrest occurred [6] [4] [3].
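The gap between event counts and person counts can be shown with a minimal sketch. The rows, identifiers, and field layout below are invented for illustration and are not ICE’s actual schema or data; the only point is that the number of rows and the number of distinct identifiers can differ.

```python
from collections import Counter

# Hypothetical arrest rows: (anonymized_id, arrest_date).
# Identifiers and dates are made up; this is not ICE's real schema.
arrests = [
    ("a01", "2025-03-02"),
    ("a02", "2025-03-05"),
    ("a01", "2025-03-20"),  # same person, second arrest in the period
    ("a03", "2025-03-21"),
]


def count_events_and_people(rows):
    """Return (number of arrest events, number of distinct identifiers)."""
    events = len(rows)
    people = len({person_id for person_id, _ in rows})
    return events, people


def repeat_arrestees(rows):
    """Return identifiers that appear in more than one arrest row."""
    counts = Counter(person_id for person_id, _ in rows)
    return sorted(pid for pid, n in counts.items() if n > 1)


events, people = count_events_and_people(arrests)
print(f"{events} arrest events, {people} distinct people")
print("arrested more than once:", repeat_arrestees(arrests))
```

Reporting both figures side by side, rather than either one alone, makes it explicit which unit a headline number refers to.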

2. Detentions: book‑ins/book‑outs, transfers, and undercounts

ICE’s detention records are organized around book‑ins and book‑outs to document time spent in custody. Because the detained population is highly transient, the records frequently include multiple rows per person for separate stints and for transfers across facilities [7] [3]. Audits and research, including the GAO review and Vera Institute analyses, found that ICE’s public reporting understates the total number of people detained: it excludes those initially booked into some temporary facilities, and it has facility gaps and inconsistencies across dataset versions. Together these problems leave tens of thousands of detentions out of public tallies [5] [8]. Third‑party groups have therefore simplified or de‑duplicated ICE detention records into “one row per stay” files to make person‑level analysis feasible [3] [8].
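One way to picture a “one row per stay” reduction is the sketch below. It is not the procedure used by the Deportation Data Project or Vera, which the sources here do not spell out. It assumes a hypothetical rule: consecutive stints for the same identifier belong to one stay when the next book‑in falls on or before the previous book‑out. The field layout and sample rows are invented, and open‑ended stints with no book‑out date are not handled.

```python
from datetime import date

# Hypothetical detention rows: (anonymized_id, facility, book_in, book_out).
# All values are invented for illustration.
stints = [
    ("a01", "Facility A", date(2025, 1, 3), date(2025, 1, 10)),
    ("a01", "Facility B", date(2025, 1, 10), date(2025, 2, 1)),  # transfer
    ("a01", "Facility C", date(2025, 6, 5), date(2025, 6, 9)),   # later stay
    ("a02", "Facility A", date(2025, 1, 4), date(2025, 1, 6)),
]


def collapse_to_stays(rows):
    """Merge stints into stays under the assumed transfer rule.

    A stint is folded into the previous stay when it has the same
    identifier and its book-in is on or before that stay's book-out.
    """
    stays = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for person_id, facility, book_in, book_out in ordered:
        last = stays[-1] if stays else None
        if last and last["id"] == person_id and book_in <= last["book_out"]:
            last["book_out"] = max(last["book_out"], book_out)
            last["facilities"].append(facility)
        else:
            stays.append({
                "id": person_id,
                "book_in": book_in,
                "book_out": book_out,
                "facilities": [facility],
            })
    return stays


stays = collapse_to_stays(stints)
print(len(stints), "stint rows ->", len(stays), "stays")
```

The merge rule is the main design choice: a stricter rule (same‑day transfers only) or a looser one (a gap of a day or two) would change the number of stays, so any published count should state which rule was applied.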

3. Removals: completed deportation events, not custody history

Removals data record the deportations and repatriations that ICE carries out. These are discrete, completed events that do not by themselves expose prior arrests or detention histories unless analysts link tables using unique identifiers [3] [1]. Removals are thus an output metric of enforcement, useful for counting outcomes, but they omit the upstream churn (multiple arrests, detentions, and transfers) that produced each outcome. They therefore cannot stand in for the scale of enforcement contacts without careful linkage [1].

4. Linkage potential and practical limits: IDs exist but datasets are siloed

ICE includes an anonymized unique A‑number column across its arrest, detention, detainer, and removal tables so researchers can, in principle, trace an individual from arrest to removal, and the Deportation Data Project reposts these close‑to‑original tables to facilitate such merges [3] [6] [1]. In practice, however, linkages are hampered by missing identifiers, inconsistent use of location fields, omissions from certain datasets (e.g., temporary facility bookings), and differences in what counts as an event versus a person. Journalists, academics, and data projects have all documented these limitations [8] [4] [3].
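Before merging tables on an identifier, it helps to measure how much of each table can actually be linked. The sketch below is a minimal illustration under stated assumptions: the `id` key, the date fields, and the sample rows are hypothetical stand‑ins for the anonymized identifier column and do not reproduce the real table layout.

```python
# Hypothetical extracts keyed by an anonymized identifier.
# None marks a missing identifier; all values are invented.
arrest_rows = [
    {"id": "a01", "arrest_date": "2025-03-02"},
    {"id": "a02", "arrest_date": "2025-03-05"},
    {"id": None, "arrest_date": "2025-03-07"},    # identifier missing
]
removal_rows = [
    {"id": "a01", "removal_date": "2025-04-15"},
    {"id": "a09", "removal_date": "2025-04-20"},  # no arrest row in this extract
]


def link_report(arrests, removals):
    """Summarize which identifiers can be matched across two tables."""
    arrest_ids = {r["id"] for r in arrests if r["id"] is not None}
    removal_ids = {r["id"] for r in removals if r["id"] is not None}
    return {
        "arrests_missing_id": sum(1 for r in arrests if r["id"] is None),
        "linked": sorted(arrest_ids & removal_ids),
        "arrest_only": sorted(arrest_ids - removal_ids),
        "removal_only": sorted(removal_ids - arrest_ids),
    }


report = link_report(arrest_rows, removal_rows)
for key, value in report.items():
    print(key, value)
```

An unmatched removal row does not by itself mean no arrest happened; it may reflect an arrest that falls outside the extract or was made by a component the arrest table omits, which is why the unmatched groups deserve reporting alongside the linked ones.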

5. Methodological transparency and contested interpretations

ICE asserts the integrity of its published statistics and warns that data may fluctuate until the fiscal‑year “lock” [9], while the GAO and independent researchers call for clearer reporting of methodologies and fuller inclusion of all detentions because current public methods can undercount and obscure patterns [5] [4]. The two sides interpret the record differently: ICE and DHS officials maintain that the agency provides sufficient public information and that oversight mechanisms govern detention standards and reporting [7] [9], but outside audits and data‑science projects continue to document gaps and to produce adjusted datasets intended to support rigorous cross‑table analysis [8] [6].

Conclusion: read the tables as complementary but non‑equivalent

Arrest, detention, and removal datasets each answer a different question: who was encountered and how (arrests), who spent time in custody and when (detentions), and who was ultimately expelled (removals). Meaningful analysis requires explicit attention to event versus person counts, documented exclusions, linkage quality, and the fact that ICE’s public methodology has been disputed by auditors and researchers [1] [5] [3]. Where possible, use the original ICE tables alongside processed versions from groups like the Deportation Data Project and transparency critiques from GAO and Vera to triangulate claims about enforcement scale and composition [6] [5] [8].
