How did The New York Times count references to Trump in the Epstein files and what keywords did they include?

Checked on February 4, 2026

Executive summary

The New York Times identified more than 5,300 documents containing more than 38,000 references to President Donald Trump in the Justice Department’s latest Epstein file release. According to the paper’s reporting, it ran the corpus through a proprietary search tool, searching for Mr. Trump’s name alongside “related words and phrases” such as his wife and Mar‑a‑Lago [1] [2]. The Times also emphasized that many of those hits were routine media clippings, unverified tips or other third‑party material placed in the files, and it did not publish a full keyword list or the internal search parameters for its tool [3] [4] [5].

1. How the Times counted mentions: a proprietary search across millions of pages

The Times described its methodology as using “a proprietary search tool” to query the Justice Department’s released pages and then tally documents and individual references that matched its search terms, yielding a count of more than 5,300 files and more than 38,000 references to Mr. Trump and related terms in the latest tranche [1] [2]. The paper’s language makes clear the work was an automated keyword search rather than a manual read‑through of every page, and the resulting figures were presented as the number of files containing matches and the total number of matched references within those files [1] [2].
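The two figures the Times reported (files containing matches, and total matched references) correspond to two different tallies over the same search. The following is a minimal sketch of that kind of count, not the Times’ actual tool: the term list is an assumption built from the three items the paper named, the use of “Melania” to stand in for “his wife” is a guess, and the sample texts are invented for illustration.

```python
import re
from typing import Iterable

# Hypothetical term list. The Times named only Mr. Trump, his wife and
# Mar-a-Lago; the exact strings it searched for were not published.
TERMS = [r"Trump", r"Melania", r"Mar[-\u2011 ]a[-\u2011 ]Lago"]
PATTERN = re.compile(r"\b(?:" + "|".join(TERMS) + r")\b", re.IGNORECASE)


def tally(documents: Iterable[str]) -> tuple[int, int]:
    """Return (documents with at least one match, total matches)."""
    docs_with_hits = 0
    total_refs = 0
    for text in documents:
        hits = len(PATTERN.findall(text))
        if hits:
            docs_with_hits += 1
            total_refs += hits
    return docs_with_hits, total_refs


# Invented sample texts, for illustration only.
sample = [
    "News clipping: Trump hosted a party at Mar-a-Lago.",
    "Flight log with no relevant names.",
    "Tip line note mentioning Donald Trump and Melania Trump.",
]
print(tally(sample))  # (2, 5)
```

A single document can contribute many references, which is why the reference total can be several times the document total.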

2. What keywords and phrases the Times said it included

Reporting explicitly states that the Times’ search looked for “Mr. Trump, his wife, his Mar‑a‑Lago club in Florida, and other related words and phrases,” which the paper cited as the terms driving its counts [1] [5]. Other Times pieces and summaries reiterate that the references encompass Trump’s name, family members and linked locations (Mar‑a‑Lago), though the Times did not publish a comprehensive list of every keyword or phrase used in its proprietary queries [2] [3].

3. What kinds of documents produced the hits — context matters

The Times reported that a majority of those hits came from media reports, news clippings and other materials that had been collected into Epstein’s files, not necessarily from original investigatory evidence or direct communications between Epstein and Trump, and that some of the matches were plainly unverified tips submitted to law enforcement [3] [4] [6]. The paper cautioned that many references were contextual (for example, articles or emails mentioning Trump) rather than contemporaneous records of meetings or allegations, which affects how the raw counts should be interpreted [7] [3].

4. Limits of interpretation and what the Times did not disclose

The Times’ reporting stopped short of releasing the exact search terms, exclusion rules, proximity thresholds or de‑duplication rules that would allow independent verification of its figures of more than 5,300 documents and more than 38,000 references; the description of a “proprietary” tool implies internal settings and filters that the newsroom did not make public [1] [2] [5]. Because the underlying methodology and keyword list were not published, outside observers cannot fully reproduce the counts or know precisely which “related words and phrases” were included [2] [5].
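To see why undisclosed settings matter for reproducibility, consider de‑duplication alone. The sketch below is a hypothetical illustration, not a description of the Times’ tool: the single search term, the whitespace‑and‑case normalisation rule and the sample corpus are all assumptions. It shows the same corpus producing different totals depending on whether duplicate copies of a document are collapsed before counting.

```python
import hashlib
import re

# Single illustrative term; the real query terms were not published.
PATTERN = re.compile(r"\bTrump\b", re.IGNORECASE)


def count_refs(documents, dedupe=False):
    """Return (documents with matches, total matches), optionally
    skipping documents whose normalised text was already seen."""
    seen = set()
    docs, refs = 0, 0
    for text in documents:
        if dedupe:
            # Assumed rule: collapse whitespace and case, then hash.
            normalised = " ".join(text.lower().split())
            key = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
            if key in seen:
                continue
            seen.add(key)
        hits = len(PATTERN.findall(text))
        if hits:
            docs += 1
            refs += hits
    return docs, refs


# Invented corpus: the same clipping filed twice, plus an unrelated memo.
clipping = "Trump spoke on Tuesday. Trump declined to comment."
corpus = [clipping, clipping.upper(), "Unrelated memo."]

print(count_refs(corpus))               # (2, 4)
print(count_refs(corpus, dedupe=True))  # (1, 2)
```

Without knowing which rule was applied, an outside analyst running the same keywords could land on either total.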

5. Conflicting counts, official caveats and political context

Other Times coverage and live updates showed slightly different snapshots. For example, an “initial review” in an earlier posting cited roughly 4,500 documents mentioning Trump, which underscores that counts can shift as reporters refine queries or as different subsets of documents are tallied [6] [7]. The Justice Department and officials warned that the released corpus contains unverified or false claims and that some material had been removed or redacted over victim‑protection concerns, a caveat the Times repeated while reporting both the raw counts and the qualitative context [4] [3]. Political actors on both sides have incentives to amplify either the numerical totals or the cautions, a reality the Times’ own coverage acknowledged in laying out how the files were used and discussed publicly [1] [8].
