Which databases track sexual misconduct allegations against state and federal lawmakers, and how do they define inclusion criteria?
Executive summary
Three kinds of trackers dominate reporting on sexual-misconduct allegations against U.S. lawmakers: journalist-curated databases (notably The Associated Press’s state-lawmakers tally), civic-data efforts such as GovTrack’s “Legislator Misconduct” database, and advocacy or research projects like the National Women’s Defense League (NWDL) report. Each defines its scope differently and relies largely on public reporting, so allegations resolved through confidential settlements or internal congressional and legislative processes are largely missing [1] [2] [3]. Crowd-maintained compilations such as Wikipedia’s list of federal sex scandals also function as de facto databases, but they suffer from inconsistent sourcing and editorial standards [4].
1. The AP state-lawmakers database — scope, method and what it includes
The Associated Press has cataloged allegations against state lawmakers since 2017; its most recent update lists at least 147 accused lawmakers across 44 states. The tally is limited to publicly reported allegations that meet journalistic standards of verification, drawing on news stories about complaints, investigations, resignations and settlements [1] [5]. It emphasizes incidents that entered the public record during and after the #MeToo moment, records outcomes such as resignations or expulsions, and includes settlements and administrative actions when those actions are reported publicly [1] [6].
2. GovTrack’s Legislator Misconduct database — federal focus and mixed categories
GovTrack’s misconduct tracker aggregates congressional misconduct items, including criminal charges, ethics investigations, committee findings and known settlements, and places them under broader “standards of conduct” entries. It documents whether congressional ethics committees found evidence supporting particular claims and records related legal actions, but it does not purport to be exhaustive of confidential settlements or off-the-record complaints [2]. In practice its inclusion criteria are tied to public records, official filings and documented committee actions, so matters settled quietly under nondisclosure terms or handled through internal processes can be omitted [2].
3. NWDL and advocacy-led reports — broader time windows, narrower subjects
Advocacy and research groups like the National Women’s Defense League have produced reports that span longer windows and claim larger tallies; the NWDL report, for example, tracks roughly 400 allegations against 145 sitting state lawmakers between 2013 and 2024. Those projects often limit their scope to publicly disclosed allegations against elected officials (not staff) and state explicitly that they undercount because of underreporting and confidential settlements [3] [7]. Because these reports aim to document systemic patterns, they include historical claims and policy analysis while noting methodological limits such as the omission of staff allegations and sealed settlements [3].
4. Crowd-sourced lists and historical compilations — usefulness and caveats
Public, crowd-edited resources like Wikipedia’s “List of federal political sex scandals” aggregate decades of cases and are useful for historical breadth, but they combine incidents with varying degrees of documentation, from criminal convictions to contested accusations. Each entry therefore needs cross-checking against primary reporting or official records to determine how it met the inclusion threshold [4]. These compilations can be helpful starting points, but they are neither uniform in their criteria nor always transparent about which allegations were substantiated, investigated, or settled.
5. Common inclusion rules — what unites these databases and where they fall short
Across these trackers the common inclusion driver is public reporting: allegations enter a database when they are reported by media, disclosed in official records, or revealed in committee findings, and the databases then code outcomes such as investigation, settlement, resignation, or exoneration. Many explicitly acknowledge that confidential settlements and underreporting make their counts conservative estimates [1] [2] [3]. That reliance on public sources produces predictable biases: spikes around news cycles such as #MeToo, heavier documentation of higher-profile officials, and uneven state-by-state visibility where legislative self-governance and sealed settlements obscure the full picture [1] [3].
6. Reading the trackers critically — politics, gaps and next steps
Users should treat each database as answering a specific question: AP for recent statehouse allegations with journalistic vetting, GovTrack for congressional misconduct tied to public legal or ethics records, NWDL for advocacy-driven, decade-scale mapping, and Wikipedia for historical aggregation. None fully captures confidential settlement terms, internal complaints never made public, or the totality of staff-targeted misconduct unless it explicitly says so [1] [2] [3] [4]. Where databases differ, motives and audiences matter: newsrooms prioritize verification and timeliness, civic-data sites prioritize linkage to formal public records, and advocacy groups stress systemic prevalence, all of which shape what is counted and what remains invisible [1] [2] [3].