How does ICE classify 'criminal' in its arrest and detention data, and where can the raw datasets be accessed?

Checked on January 21, 2026

Executive summary

ICE’s public datasets and FOIA releases classify someone as “criminal” based primarily on whether the agency’s records show a conviction or pending criminal charge; that classification appears as discrete fields (e.g., “convicted — yes/no,” “apprehension criminality”) rather than a nuanced risk or offense-type taxonomy [1] [2] [3]. Raw, individual-level ICE arrest, detainer, detention, encounter and removal files — plus codebooks and processed extracts — are available for bulk download from the Deportation Data Project (which republishes ICE FOIA data) and from ICE/OHSS portals (with dashboards and spreadsheets described below) [4] [5] [3].

1. How “criminal” shows up in ICE files: a records-based flag, not a legal taxonomy

ICE datasets typically indicate criminality through fields that record whether an individual has a prior conviction or a pending charge in ICE’s system of record (for example, a yes/no “convicted” variable or “apprehension criminality” codes); this means the dataset reflects what ICE documented, not an independent legal classification of culpability or offense gravity [1] [2] [6]. Agency guidance and outside explainers stress that “non-criminal” in the tables usually means “no prior criminal convictions recorded in ICE’s data” and does not necessarily capture uncoded misdemeanors, crimes prosecuted elsewhere, or offense details absent from the dataset [6] [7].
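
For illustration, a minimal Python sketch of how such a flag might be inspected in a downloaded extract follows; the filename and column names (“apprehension_criminality”, “conviction_flag”) are hypothetical placeholders, so consult the release’s codebook for the fields actually present.

```python
# Minimal sketch, assuming a locally downloaded arrests extract in CSV form.
# "ice_arrests_extract.csv", "apprehension_criminality" and "conviction_flag"
# are hypothetical names; the Deportation Data Project codebook lists the
# actual fields and value codes for each release.
import pandas as pd

arrests = pd.read_csv("ice_arrests_extract.csv", dtype=str)

# Tabulate the raw values, including blanks, before recoding anything.
print(arrests["apprehension_criminality"].value_counts(dropna=False))

# Derive a conservative indicator: "conviction recorded in ICE data" vs. not.
# Unrecognized or missing codes stay NA instead of being treated as "non-criminal".
arrests["conviction_recorded"] = arrests["conviction_flag"].map({"Yes": True, "No": False})
print(arrests["conviction_recorded"].value_counts(dropna=False))
```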

2. Administrative arrests versus criminal arrests: two different regimes in the data

Public-facing analyses emphasize that many ICE “arrests” are administrative (civil) enforcement actions for immigration violations rather than criminal arrests for new crimes; ICE’s own materials and expert guides note that administrative arrests are made for civil immigration violations even when the person has a criminal history listed in ICE records [8] [3]. Statute and ICE operational practice also provide criminal arrest authority in limited circumstances (e.g., when an officer has reason to believe a felony was committed), but the datasets do not always make the legal basis of an arrest clearly distinguishable to outside analysts [8].
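
As a rough illustration of that limitation, the sketch below cross-tabulates an assumed apprehension-method field against an assumed criminality code to see what, if anything, a given release exposes about the legal basis of each arrest; both column names are hypothetical and must be checked against the codebook.

```python
# Illustrative only: cross-tabulate an assumed apprehension-method field against
# an assumed criminality code. Both column names are placeholders.
import pandas as pd

arrests = pd.read_csv("ice_arrests_extract.csv", dtype=str)

# Keep missing codes visible as their own category rather than dropping them.
method = arrests["apprehension_method"].fillna("(missing)")
criminality = arrests["apprehension_criminality"].fillna("(missing)")

print(pd.crosstab(method, criminality, margins=True))
```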

3. What variables are available and what they mean (practical guide for users)

The Deportation Data Project hosts the original ICE tables and a codebook describing fields such as arrests/apprehensions, detainers, detentions (book-ins/book-outs), encounters and removals, and it posts a simplified “one row per stay” processed file plus documentation on how fields were standardized [4] [2]. DHS/OHSS detention data similarly include a conviction indicator and other metadata (e.g., arrest location), but both sources, and analysts who have worked with them, warn that missingness, terminology that changes across releases, and counting conventions (e.g., multiple book-ins per person) complicate interpretation [1] [9].
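
A short sketch of the kind of quality checks those caveats call for is below; the filename and column names (“person_id”, a stay-level CSV) are assumptions, since the actual standardized field names are defined in the project’s codebook.

```python
# Basic quality checks on a processed "one row per stay" file, assuming
# hypothetical names ("ice_detention_stays.csv", "person_id"); the codebook
# documents the real fields and how they were standardized.
import pandas as pd

stays = pd.read_csv("ice_detention_stays.csv", dtype=str)

# Missingness: fields with many blanks should be flagged before any analysis.
print(stays.isna().mean().sort_values(ascending=False).head(10))

# Counting convention: the number of stays (book-ins) is not the number of people.
stays_per_person = stays.groupby("person_id").size()
print("rows (stays):", len(stays))
print("unique persons:", stays_per_person.size)
print("persons with multiple book-ins:", int((stays_per_person > 1).sum()))
```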

4. Where to get the raw datasets and codebooks (direct leads)

The Deportation Data Project republishes the full FOIA releases and offers a ZIP of raw ICE files (historical 2012–2023 files and later releases), processed extracts, and a codebook of field definitions, making it the central, researcher-friendly access point for individual-level ICE data [4] [5]. ICE’s own Statistics pages and the Office of Homeland Security Statistics (OHSS) provide official ERO/Detention dashboards and spreadsheets and are the original government sources for many aggregated tables [3] [1].
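
Once a raw release ZIP has been downloaded, its contents can be inventoried before committing to an analysis; the sketch below assumes a placeholder local filename and that the archive contains Excel tables (reading .xlsx files with pandas also requires the openpyxl package).

```python
# Sketch: list the tables inside a downloaded FOIA release ZIP and load one
# Excel table. File and sheet names here are placeholders.
import zipfile
import pandas as pd

ZIP_PATH = "ice_foia_release.zip"  # hypothetical local filename

with zipfile.ZipFile(ZIP_PATH) as zf:
    members = zf.namelist()
    print("\n".join(members))

    # Excel tables can be read straight from the archive without extracting.
    xlsx_members = [m for m in members if m.lower().endswith(".xlsx")]
    if xlsx_members:
        with zf.open(xlsx_members[0]) as fh:
            table = pd.read_excel(fh, dtype=str)
        print(table.shape, list(table.columns)[:10])
```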

5. Linking records and practical limitations for classifying criminality

The linked ICE datasets sometimes include unique IDs that allow a person to be tracked across arrests, detainers, detentions and removals, enabling researchers to trace enforcement pathways. However, the data are siloed from the immigration court (EOIR) and CBP datasets and contain gaps and missing identifiers that can prevent clean merges and obscure which arrests correspond to which court outcomes [9] [2]. Several independent analysts and organizations note that the datasets do not always disclose offense type or granular charge information and that some categories (e.g., traffic offenses) may be inconsistently coded across jurisdictions [7] [6].
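
A minimal linkage sketch follows; “person_id” is an assumed identifier, and whether such an ID exists at all, and what it is called, varies by release, so this only illustrates how link failure might be measured.

```python
# Sketch: link detention stays to arrests on a shared unique ID and measure how
# much of each file fails to link. Filenames and "person_id" are assumptions.
import pandas as pd

arrests = pd.read_csv("ice_arrests_extract.csv", dtype=str)
detentions = pd.read_csv("ice_detention_stays.csv", dtype=str)

for name, df in [("arrests", arrests), ("detentions", detentions)]:
    print(name, "rows missing person_id:", df["person_id"].isna().sum())

# Drop rows without an identifier so blank IDs are never matched to each other.
merged = detentions.dropna(subset=["person_id"]).merge(
    arrests.dropna(subset=["person_id"]),
    on="person_id",
    how="left",
    indicator=True,
    suffixes=("_det", "_arr"),
)
print(merged["_merge"].value_counts())  # "left_only" = stays with no linked arrest record
```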

6. How analysts interpret “criminal” shares and the debate around them

Recent analyses using Deportation Data Project releases have found large shares of ICE detainees with no convictions recorded in ICE data, spurring debate: some policy groups argue this demonstrates sweeping enforcement against non-criminal immigration violations, while others caution that dataset limitations and the differences between administrative and custodial pathways complicate claims about who is “criminal” [10] [7] [11]. The data do provide a basis for comparing counts and trends, but credible interpretation requires close attention to codebooks, missing fields, and the distinction between “no conviction in ICE records” and “no crime committed” [4] [2] [8].
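
To keep that distinction explicit in practice, a share calculation can be worded in terms of what the records contain rather than what a person did; the sketch below uses a placeholder snapshot file and column name.

```python
# Sketch: compute the share of detainees with no conviction recorded in ICE data,
# keeping missing values separate so "no conviction recorded" is not conflated
# with "unknown". Filename and column name are illustrative placeholders.
import pandas as pd

detainees = pd.read_csv("ice_detention_snapshot.csv", dtype=str)

label = detainees["conviction_flag"].map(
    {"Yes": "conviction recorded in ICE data", "No": "no conviction recorded in ICE data"}
).fillna("missing or unrecognized code")

print(label.value_counts(normalize=True).round(3))
# "No conviction recorded" means none appears in ICE's records for this release,
# not that no offense occurred or was prosecuted elsewhere.
```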

7. Bottom line for researchers and journalists

“Criminal” in ICE’s arrest and detention data is a records-driven flag (conviction/pending-charge fields and coded criminality descriptors) rather than a standardized legal severity scale. The best route to inspecting the raw fields is to download the original ICE FOIA releases and processed files from the Deportation Data Project and to consult official ICE/OHSS tables and codebooks for definitions and caveats [4] [5] [3]. Any rigorous analysis must document how the dataset defines criminality, address missingness and siloing (EOIR/CBP), and avoid equating “no conviction recorded” with the absence of an offense without corroborating sources [2] [9] [8].

Want to dive deeper?
How do ICE ‘detainer’ records work and where can detainer request data be downloaded?
What are the limitations when linking ICE datasets to EOIR immigration court records, and where are EOIR raw datasets hosted?
How do researchers validate ICE conviction flags against local criminal court records to measure coding errors?