How can I search the DOJ Epstein Data Sets for a specific EFTA file number?
Executive summary
The DOJ’s Epstein repository names each released document with an EFTA file number and hosts the files in dataset-specific directories on the DOJ site. A specific EFTA number can therefore be located either by navigating the Department of Justice Data Set pages and opening the matching PDF, or by using third‑party searchable compilations that index those filenames [1] [2] [3]. Independent aggregates and search tools can make that easier, but researchers should note that the DOJ has released the data sets in many parts, has sometimes removed items, and warns that the corpus may contain sensitive material, asking the public to report inadvertent disclosures to the DOJ [4] [3] [2].
1. Know what an “EFTA” number looks like and where it lives on the DOJ site
Each released file is given a sequential EFTA identifier embedded in the DOJ file name (for example, files appear as EFTA00025010.pdf and EFTA00020493.pdf in dataset folders), and those PDFs are organized by Data Set pages on the DOJ Epstein hub [5] [6] [2]. The main DOJ Epstein landing page and the “DOJ Disclosures” pages list Data Set links that serve as the entry points to the file directories where the EFTA PDFs are stored [1] [7] [2].
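Because the identifier has a fixed shape, user input can be normalized before any lookup. The sketch below is illustrative only: it assumes the format is "EFTA" followed by eight zero-padded digits, which is inferred from the two example filenames above rather than from any DOJ specification, and the function name normalize_efta is hypothetical.

```python
import re

# Assumption: identifiers are "EFTA" plus eight zero-padded digits,
# inferred from examples such as EFTA00025010.pdf.
_EFTA_INPUT = re.compile(r"\s*(?:EFTA)?\s*(\d{1,8})(?:\.pdf)?\s*", re.IGNORECASE)


def normalize_efta(text: str) -> str:
    """Return the canonical 'EFTA########' form of a user-supplied number."""
    match = _EFTA_INPUT.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognizable EFTA number: {text!r}")
    # Left-pad with zeros so '25010' and '00025010' normalize identically.
    return "EFTA" + match.group(1).zfill(8)
```

With this helper, inputs such as "efta25010", "EFTA00025010.pdf" and "25010" all normalize to the same identifier, which avoids missed matches caused by case or padding differences.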
2. Direct URL technique: build the file path if the number is known
When the exact EFTA number and its Data Set are both known, a reliable method is to follow the URL pattern observed on the DOJ site: take the relevant Data Set path and append the filename (for example, the DOJ hosts files under paths like /epstein/files/DataSet+8/EFTA00025010.pdf) to load the document directly [5] [2]. That direct‑URL approach avoids browsing or waiting on an index, and it works because the DOJ published thousands of individual PDFs with their EFTA filenames visible in those dataset directories [3].
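The path pattern can be assembled programmatically. This is a minimal sketch, not a confirmed DOJ interface: the path segment is copied from the example above, while the host https://www.justice.gov is an assumption (the text gives only the path), and build_efta_url is a hypothetical helper name. Whether the server expects "DataSet+8" or a differently encoded space is also not confirmed here.

```python
import re

# Assumption: the DOJ Epstein hub is served from this host.
BASE_URL = "https://www.justice.gov"


def build_efta_url(data_set: int, efta_id: str, base_url: str = BASE_URL) -> str:
    """Build a direct PDF URL from a Data Set number and an EFTA identifier."""
    if data_set < 1:
        raise ValueError("data_set must be a positive integer")
    if not re.fullmatch(r"EFTA\d{8}", efta_id):
        raise ValueError(f"unexpected EFTA identifier: {efta_id!r}")
    # Path pattern as quoted in the text: /epstein/files/DataSet+8/EFTA00025010.pdf
    return f"{base_url}/epstein/files/DataSet+{data_set}/{efta_id}.pdf"


if __name__ == "__main__":
    print(build_efta_url(8, "EFTA00025010"))
```

If the resulting link does not load, the file may sit in a different Data Set or may have been removed, so a failed URL is not proof that the number is invalid.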
3. Use the DOJ Data Set landing pages when the dataset is unknown
If the data set containing an EFTA number is not known, consult the DOJ’s Data Set index pages (for example Data Set 1, 4, 8 and 12), where documents are grouped by release batch and can be browsed; these pages are where the DOJ itself publishes the released files, so they are the authoritative source [2] [8] [9] [10]. Each page lists the files for that batch. The DOJ also warns that, because of the volume and the sensitive content, members of the public should notify EFTA@usdoj.gov of any inadvertent disclosures, an important procedural detail for anyone who encounters problematic material [2].
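When the data set is unknown, one option is to try the same filename against several Data Set paths in turn. The sketch below is hedged on several points: the host is assumed, the path pattern is the one quoted in this article, the helper names are hypothetical, and the DOJ site may reject HEAD requests or automated traffic altogether. The probe is injectable so the search logic can be exercised without any network access.

```python
import urllib.request
from typing import Callable, Iterable, Optional

# Assumption: host of the DOJ hub; only the path pattern appears in the article.
BASE_URL = "https://www.justice.gov"


def http_head_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request for url answers with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return False


def find_data_set(
    efta_id: str,
    data_sets: Iterable[int],
    probe: Callable[[str], bool] = http_head_ok,
) -> Optional[int]:
    """Return the first Data Set number whose candidate URL passes the probe."""
    for number in data_sets:
        url = f"{BASE_URL}/epstein/files/DataSet+{number}/{efta_id}.pdf"
        if probe(url):
            return number
    return None  # not found in the data sets that were checked
```

A call such as find_data_set("EFTA00025010", [1, 4, 8, 12]) would check only the data sets named here; a None result means the file was not found in those, not that it was never published.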
4. Faster alternatives: searchable third‑party archives and tools
Independent projects and newsrooms have compiled the DOJ releases into searchable databases or combined archives to ease navigation. Examples include a Google Pinpoint collection that retains DOJ releases and an Internet Archive compilation that groups EFTA ranges into downloadable sets [4] [3]. Those third‑party tools often index filenames and full text, so a specific EFTA number can be found with a site search or a filename lookup far faster than by clicking through thousands of PDFs [4] [3].
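For a compilation that has already been downloaded, a filename lookup can be approximated locally. The following sketch assumes the downloaded files keep their EFTA filenames and that identifiers are eight digits long; index_by_efta is a hypothetical helper, not part of any of the tools mentioned above.

```python
import re
from typing import Dict, Iterable, List

# Assumption: eight-digit identifiers; the lookahead rejects longer digit runs.
_EFTA_IN_NAME = re.compile(r"EFTA\d{8}(?!\d)", re.IGNORECASE)


def index_by_efta(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each EFTA identifier to every path whose filename contains it."""
    index: Dict[str, List[str]] = {}
    for path in paths:
        # Look only at the final path component, whatever the separator.
        name = path.replace("\\", "/").rsplit("/", 1)[-1]
        match = _EFTA_IN_NAME.search(name)
        if match:
            index.setdefault(match.group(0).upper(), []).append(path)
    return index
```

To index a local folder, the paths can be supplied as (str(p) for p in pathlib.Path(folder).rglob("*.pdf")). Keeping a list per identifier makes it visible when the same EFTA number appears in more than one downloaded set.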
5. Caveats: deletions, partial releases and the limits of public indexing
Researchers should be aware that the DOJ has been releasing files in waves and that the universe of potentially responsive pages is larger than what has been published. Oversight reporting notes that millions of pages were identified and only a subset released, and some previously posted items have been removed, so no search for a particular EFTA number can be guaranteed complete [11] [4]. When a direct search fails, check alternate Data Set pages, third‑party compilations, the Internet Archive mirror, and contemporaneous news indexes, while recognizing that DOJ removals or withheld or redacted files may mean an EFTA number was never published or has been retracted [3] [11] [4].