How do databases define and verify 'unarmed' in police shooting statistics?
Executive summary
Databases that track “unarmed” in police shootings are built from non‑governmental compilation of local reporting; major trackers like The Washington Post and Mapping Police Violence label armed status using media accounts, police statements and open‑source research rather than a single legal standard [1] [2]. Critics say those classifications can conflate threat behavior with possession of a weapon, producing disputed counts of “unarmed” cases, and academic users warn that differences in definitions drive varying results across datasets [3] [4].
1. How the big non‑governmental trackers decide “unarmed” — aggregation, not adjudication
Leading public datasets do not draw on a single official federal data field; they compile incidents from local news, police releases, social media and other open sources, then code whether the decedent had a weapon, or was described as unarmed, on the basis of that reporting (The Washington Post’s Fatal Force methodology and Mapping Police Violence’s stated approach) [1] [2]. Researchers using these repositories note that “armed” is typically recorded as possession of any weapon, and “unarmed” as the absence of any identified weapon in the available reporting [4] [1].
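To make that coding rule concrete, the sketch below collapses weapon mentions gathered from multiple sources into a single armed‑status code. It is a minimal illustration under stated assumptions: the category list, field names and fallback to “undetermined” are hypothetical, not any tracker’s published schema.

```python
# Hypothetical illustration of the coding rule described above:
# "armed" = any recognized weapon identified in the reporting,
# "unarmed" = no weapon identified, anything else = "undetermined".

WEAPON_TERMS = {"gun", "knife", "replica gun", "vehicle", "blunt object"}

def code_armed_status(weapons_reported: set[str]) -> str:
    """Collapse free-text weapon mentions from source reporting into one code."""
    if not weapons_reported:                 # no weapon identified in any source
        return "unarmed"
    if weapons_reported & WEAPON_TERMS:      # at least one recognized weapon term
        return "armed"
    return "undetermined"                    # mentions exist but match no category

print(code_armed_status(set()))               # -> unarmed
print(code_armed_status({"knife"}))           # -> armed
print(code_armed_status({"object in hand"}))  # -> undetermined
```

Note that the final fallback is exactly where the disputes described in the next section arise: a coder must decide whether a toy, a vehicle or a raised fist belongs on the weapon list at all.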
2. Why classification is contentious — differing concepts of “weapon” and “threat”
Commentators and fact‑checkers show that datasets sometimes call a person “unarmed” even when reports describe threatening movements, use of vehicles, toys, or fists — distinctions that change the public meaning of “unarmed.” A City Journal critique of the Washington Post database points out incidents coded “unarmed” where reporting described attacks, threatening movements, or visible non‑gun weapons, arguing that omitting those threat‑type details can mislead users [3]. Academic reviews likewise note that “armed” is conventionally defined as possession of any weapon, but that threat behaviors are coded separately and inconsistently [4].
3. The practical process of verification — human coding and cross‑checks
Databases commonly use trained researchers who read multiple sources, cross‑reference independent repositories, and update entries as new information emerges; the Washington Post, for example, validates circumstances with local reporting, police websites and other open databases and continues updating past entries when facts change [1] [5]. University research projects and public‑health teams have replicated this labor‑intensive approach: manual reviews by trained coders are central to assigning armed status in nonfederal datasets [6] [5].
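The cross‑checking step can be pictured as a join between independent repositories. The sketch below is a rough approximation rather than any tracker’s actual pipeline: it matches records from two hypothetical files on name and date and flags armed‑status disagreements for human re‑review; the file and column names are assumptions.

```python
# Sketch of cross-referencing two independent repositories. File and column
# names ("tracker_a.csv", "armed_status", ...) are hypothetical placeholders.

import pandas as pd

tracker_a = pd.read_csv("tracker_a.csv")  # columns: name, date, armed_status
tracker_b = pd.read_csv("tracker_b.csv")  # columns: name, date, armed_status

# Inner-join on identifying fields, keeping both trackers' codes side by side.
merged = tracker_a.merge(
    tracker_b, on=["name", "date"], suffixes=("_a", "_b"), how="inner"
)

# Records where the two repositories disagree go back to human coders,
# mirroring the manual cross-checks the trackers describe.
disagreements = merged[merged["armed_status_a"] != merged["armed_status_b"]]
print(f"{len(disagreements)} of {len(merged)} matched records disagree")
```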
4. Limits of official federal data and why civilians’ projects filled the gap
The FBI and other federal collections have historically undercounted incidents and lack consistent fields for use of force and armed status, so journalists and researchers created independent repositories to provide more complete incident‑level data [1] [7]. Studies and watchdog reporting show federal systems do not require local agencies to report all incidents, which leaves independent datasets as the most complete public source for “unarmed” tallies [1] [7].
5. How researchers handle ambiguity and disagreement in coding
Academics acknowledge and try to adjust for dataset limitations: peer‑reviewed studies using Washington Post data restrict samples to cases with “known armed status” and discuss residual uncertainty; they also use sensitivity analyses to test how different coding choices affect racial disparity estimates [5] [4]. Critics, however, argue that some apparent disparities shrink when contextual threat‑type details are considered, illustrating how definitional choices materially affect conclusions [3] [4].
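Both hedging strategies are straightforward to express in code. The sketch below first restricts a dataset to cases with known armed status, then recodes one borderline category (“toy weapon” as unarmed) to show how a group‑level estimate moves. The column names follow the Post’s published v1 data file but should be treated as assumptions, and the recoding rule is purely illustrative.

```python
# Sketch of "known armed status" restriction plus a one-knob sensitivity
# analysis. Column names ("armed", "race") assume the Washington Post's v1
# CSV schema; verify against the dataset's own documentation.

import pandas as pd

df = pd.read_csv("fatal-police-shootings-data.csv")

# 1. Restrict to cases where armed status is actually known.
known = df[df["armed"].notna() & (df["armed"] != "undetermined")]

def share_unarmed(frame: pd.DataFrame) -> pd.Series:
    """Share of each racial group's cases coded 'unarmed'."""
    return (frame["armed"] == "unarmed").groupby(frame["race"]).mean()

baseline = share_unarmed(known)

# 2. Sensitivity analysis: recode one disputed category and re-estimate.
recoded = known.assign(armed=known["armed"].replace({"toy weapon": "unarmed"}))
alternative = share_unarmed(recoded)

print(pd.DataFrame({"baseline": baseline, "toys_as_unarmed": alternative}))
```

The gap between the two columns of that output is a direct measure of how much a single definitional choice moves the headline estimate.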
6. Consequences for reporting, policy and public debate
Because trackers differ in whether and how they count toy weapons, vehicles used as weapons, or aggressive movements, public claims about how many people were “unarmed” vary widely across outlets and advocacy groups, producing both policy pressure (e.g., state investigations in California of fatal shootings of unarmed people) and pushback from critics who question how the datasets present those cases [8] [3]. Advocacy groups and policy reports often use the “unarmed” label to highlight racial disparities, while methodological critiques warn that oversimplified labels can be weaponized to inflame rather than inform [9] [3].
7. What the sources don’t settle — and what to watch for in the data
Available sources do not present a single authoritative legal definition of “unarmed” applied across jurisdictions; no such standard appears in current reporting. Instead, definitions come from how each database codes incidents and from the underlying reporting it relies on. Users should look for a dataset’s coding guide, whether it documents “threat type,” and how it treats ambiguous cases; those metadata items determine whether an incident appears in an “unarmed” count [1] [3] [4].
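One quick check of that kind, assuming the Post’s v1 file layout (its “armed” and “threat_level” columns), is to list which coding fields a downloaded file actually contains and what categories they use; verify the names against the dataset’s own coding guide.

```python
# Inspect which coding columns a dataset documents and their categories.
# Column names assume the Washington Post v1 CSV; adjust for other trackers.

import pandas as pd

df = pd.read_csv("fatal-police-shootings-data.csv")

for col in ("armed", "threat_level"):
    if col in df.columns:
        print(col, "->", sorted(df[col].dropna().unique()))
    else:
        print(col, "-> not documented in this file")
```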
Summary takeaways: independent databases label “unarmed” based on media, police and open‑source evidence, not unified legal criteria [1] [2]; coding choices about toys, vehicles and threatening movement drive disputes [3]; and researchers hedge results by restricting to known status or testing alternative codings to reveal how sensitive findings are to definition [5] [4].