Which UK police forces have released their full FOI datasets on arrests linked to online speech, and how do their methodologies differ?
Executive summary
Publicly available sources show that many UK forces supply arrest and crime data to central portals and that at least the Metropolitan Police publishes full custody/arrest datasets locally. However, the provided reporting contains no single authoritative list naming every force that has released a full Freedom of Information (FOI) dataset specifically labelled “arrests linked to online speech.” Data publication practices are fragmented: national portals use standard monthly feeds from most forces (43 geographic forces plus others) [1], the Met publishes full datasets via the London Datastore [2], and FOI-driven media projects have obtained partial replies from the majority of forces but report different tallies and coverage [3] [4].
1. What counts as “full FOI datasets” and where data is already published
The question hinges on definition. “Full FOI datasets on arrests linked to online speech” could mean either (a) forces proactively publishing arrest-level custody or offence data that can be queried for online-communication offences, or (b) the specific datasets returned to journalists under FOI requests. On the first reading, the national police open data portal collects standardised monthly crime, outcome and arrest-type files from 43 geographic forces plus British Transport Police and PSNI, which indicates a baseline of proactive publication by most forces [1] [5]. The Metropolitan Police additionally directs users to full custody and crime datasets on the London Datastore, an example of a force publishing comprehensive datasets that researchers can use to isolate online-speech offences [2].
2. What journalistic FOI projects have actually obtained and what that reveals
Two media threads demonstrate the FOI route: a Daily Mail project reported that 39 of 45 forces replied to FOI requests about arrests under Communications Act/Malicious Communications Act charges, claiming around 9,700 arrests in a year [3], while an independent Substack analysis asserted that 37 of 45 forces supplied data and estimated over 12,000 arrests for online communications offences [4]. These differences expose how FOI-driven tallies can diverge, whether through different question wording, date ranges, inclusion or exclusion of certain offence codes, or simple non-response. They also underline that media FOI returns are not equivalent to regular public dataset publication [3] [4].
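To make the comparison concrete, the short Python sketch below works through the arithmetic on the two reported tallies. It uses only the figures quoted above; the rounded totals (“around 9,700” and “over 12,000”) are treated as point values, which is an assumption, and the per-force average is an illustrative measure introduced here, not something either outlet published.

```python
# Figures as reported in [3] and [4]; both arrest totals are approximate,
# and "over 12,000" is really a lower bound, not an exact count.
reports = {
    "Daily Mail [3]": {"responded": 39, "asked": 45, "arrests": 9_700},
    "Substack [4]": {"responded": 37, "asked": 45, "arrests": 12_000},
}


def summarise(report):
    """Return the FOI response rate and the mean arrests per responding force."""
    rate = report["responded"] / report["asked"]
    per_force = report["arrests"] / report["responded"]
    return {
        "response_rate": round(rate, 3),
        "arrests_per_responding_force": round(per_force, 1),
    }


summaries = {name: summarise(r) for name, r in reports.items()}

for name, summary in summaries.items():
    print(name, summary)
```

The project with fewer responding forces reports the larger total, so the gap cannot be explained by response counts alone; differences in offence codes, date ranges or which forces answered must account for it. The averages should not be used to scale up to a national figure, because forces differ greatly in size and the non-responding forces may not resemble the responders.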
3. How methodologies differ between forces and data sources
Methodological divergence shows up at three levels: data collection, classification and sharing. First, forces feed standardised monthly files into data.police.uk, but the portal’s aggregation replaces precise locations with mapped points and applies privacy rules, which reduces granularity [5] [1]. Second, offence classification and what gets counted as “online speech” vary: some FOI replies counted arrests under specific statutory sections (e.g., s127 Communications Act 2003, s1 Malicious Communications Act 1988), while others may have searched local intelligence systems or custody logs for broader keywords, producing inconsistent denominators [3] [4]. Third, internal policing tools such as the Police National Database (PND) are used differently across forces; academic work shows wide variation in PND usage and therefore in how quickly and on what basis social-media traces turn into arrests [6].
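The classification problem can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the record fields, the keyword list and the five sample records are invented for illustration and do not reflect any force's actual custody system or FOI response. It shows only how a statute-based count and a keyword-based count can select different arrests from the same records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrestRecord:
    record_id: int
    offence: str  # offence label as recorded (hypothetical values)
    notes: str    # free-text custody note (hypothetical values)


# Rule A: count only arrests under the named statutory sections.
STATUTORY_OFFENCES = {
    "s127 Communications Act 2003",
    "s1 Malicious Communications Act 1988",
}

# Rule B: search free text for broad keywords (an assumed, illustrative list).
KEYWORDS = ("social media", "online", "facebook")


def ids_by_statute(records):
    """IDs of records whose offence label is one of the named sections."""
    return {r.record_id for r in records if r.offence in STATUTORY_OFFENCES}


def ids_by_keyword(records):
    """IDs of records whose notes mention any keyword, case-insensitively."""
    return {
        r.record_id
        for r in records
        if any(keyword in r.notes.lower() for keyword in KEYWORDS)
    }


records = [
    ArrestRecord(1, "s127 Communications Act 2003", "Offensive message posted online"),
    ArrestRecord(2, "s1 Malicious Communications Act 1988", "Threatening letter sent by post"),
    ArrestRecord(3, "Harassment", "Repeated contact via social media"),
    ArrestRecord(4, "Theft", "Shoplifting in town centre"),
    ArrestRecord(5, "Public order", "Abuse shouted in street, later shared on Facebook"),
]

statute_ids = ids_by_statute(records)
keyword_ids = ids_by_keyword(records)

print("Statute rule:", sorted(statute_ids))
print("Keyword rule:", sorted(keyword_ids))
print("Counted by both:", sorted(statute_ids & keyword_ids))
```

In this toy data the statute rule counts a posted letter that involved no online activity, while the keyword rule picks up harassment and public-order arrests that the statute rule misses. Only one record is counted by both, which is why totals built on different rules are not directly comparable.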
4. Why discrepancies and gaps persist — and the incentives behind them
Discrepancies stem from differing operational definitions, FOI compliance variability, and data-protection/privacy filtering applied before publication [1]. Media projects have incentives to produce dramatic national tallies and may frame non-responses as bad faith; conversely, police forces have institutional incentives to avoid releasing raw intelligence or personally identifying custody-level details, which can lead to partial responses or aggregate statistics instead [3] [4] [1]. Academic research also warns that national intelligence platforms like the PND magnify local practice differences when identifying suspects from social media traces [6].
5. Bottom line for researchers and readers
The supplied reporting supports two confident statements: most forces supply standard crime and arrest files to the national portal (43 geographic forces plus BTP and PSNI) [1] [5], and the Metropolitan Police makes full datasets available via the London Datastore [2]. Beyond that, FOI-based national tallies exist but vary by outlet and methodology [3] [4]. The materials provided do not include a definitive, source-backed list of every force that has proactively released a “full FOI dataset” explicitly labelled as arrests for online speech, so any comprehensive naming of forces would require checking each force’s published data feed and the raw FOI responses not included in these sources.