What steps are required to recreate a country-by-senator contributions table from FEC and Senate disclosure data?
Executive summary
Recreating a country-by-senator contributions table requires assembling two public but distinct datasets, FEC campaign finance records and Senate public financial and lobbying disclosures, and then building a reconciliation pipeline. That pipeline links contributors (through their addresses or employer strings) to national jurisdictions and to senators’ committees or campaigns. The data access points, formats, and legal caveats are documented by the FEC and the Senate Office of Public Records [1] [2] [3]. Three practical limitations should be anticipated: Senate filings were handled differently before 2018 and may be incomplete in FEC digital image sets, contributor “country” is not a standard canonical field, and Senate ethics disclosures differ in scope from FEC campaign reports [4] [3] [5].
1. Identify and retrieve authoritative source files from the FEC and the Senate
Start by cataloging the machine-readable FEC campaign datasets and committee reports available from the FEC’s Browse Data and reports endpoints, which include individual contribution records and raw committee filings [1] [6]. Then add Senate public records from the Senate Office of Public Records, including the Public Financial Disclosure database and the eFD filing system required by the STOCK Act [2] [5]. Keep in mind that Senate reports were historically filed with the Secretary of the Senate and, before 2018, often arrived as microfilm rather than as the FEC’s digital images [4] [3].
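Once a file has been downloaded, loading it with its provenance attached might look like the sketch below. It assumes a pipe-delimited text file with no header row; the delimiter, the column list `HYPOTHETICAL_COLUMNS`, and the file name are illustrative assumptions and should be replaced with the layout given in the data dictionary published alongside each file.

```python
import csv
import io
from typing import Dict, Iterator, List, TextIO


def read_delimited_records(
    handle: TextIO,
    columns: List[str],
    source_file: str,
    delimiter: str = "|",
) -> Iterator[Dict[str, str]]:
    """Yield one dict per line, tagged with the file name and line number."""
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for line_number, fields in enumerate(reader, start=1):
        if len(fields) != len(columns):
            # Fail loudly instead of silently shifting columns.
            raise ValueError(
                f"{source_file}:{line_number}: expected {len(columns)} "
                f"fields, got {len(fields)}"
            )
        record = dict(zip(columns, fields))
        record["source_file"] = source_file
        record["source_line"] = str(line_number)
        yield record


# Hypothetical column layout and made-up rows, for illustration only.
HYPOTHETICAL_COLUMNS = ["committee_id", "name", "city", "state", "amount"]
sample = io.StringIO(
    "C00000001|DOE, JANE|SPRINGFIELD|IL|250\n"
    "C00000002|ROE, RICHARD|PARIS||100\n"
)
records = list(read_delimited_records(sample, HYPOTHETICAL_COLUMNS, "sample.txt"))
```

Every record carries its source file and line number from the first step, so later stages can always point back to the original filing.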
2. Define the table schema and provenance rules
Design a country-by-senator table schema that records the senator identifier, reporting period, contributor identifier, contributor raw address and employer, inferred country, contribution amount, contribution type (individual, PAC, corporate), source file, and source record ID. The schema must also flag whether each row comes from a committee report filed with the FEC or from a Senate public disclosure, because the two sources differ in scope and follow independent filing rules [5] [3].
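One way to pin down such a schema is a typed record. The sketch below is a minimal illustration: the field names, the `SourceSystem` enum, and the example values are all assumptions chosen for readability, not an established format.

```python
from dataclasses import asdict, dataclass
from decimal import Decimal
from enum import Enum
from typing import Optional


class SourceSystem(Enum):
    """Which filing pathway a row came from (hypothetical labels)."""
    FEC = "fec_committee_report"
    SENATE = "senate_public_disclosure"


@dataclass(frozen=True)
class ContributionRow:
    senator_id: str
    reporting_period: str
    contributor_id: str
    contributor_raw_address: str
    contributor_raw_employer: str
    inferred_country: Optional[str]  # None when no country could be inferred
    amount: Decimal                  # Decimal avoids float rounding on money
    contribution_type: str           # "individual", "pac", or "corporate"
    source_system: SourceSystem
    source_file: str
    source_record_id: str


# Made-up example row; none of these identifiers are real.
row = ContributionRow(
    senator_id="S-EXAMPLE-001",
    reporting_period="2020-Q1",
    contributor_id="D-000042",
    contributor_raw_address="100 Main St, Springfield, IL 62701",
    contributor_raw_employer="EXAMPLE CORP",
    inferred_country="US",
    amount=Decimal("250.00"),
    contribution_type="individual",
    source_system=SourceSystem.FEC,
    source_file="sample.txt",
    source_record_id="sample.txt:1",
)
row_as_dict = asdict(row)
```

Keeping the raw address and employer next to the inferred country lets a reviewer see exactly what the inference was based on.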
3. Parse and normalize contributor identity and location fields
Extract contributor name, address, employer, and occupation fields from FEC contribution records and related committee reports, and from Senate disclosures where relevant. Then run normalization and entity resolution to collapse variants of the same donor. Because neither the FEC nor Senate disclosures guarantee a standardized country code field, resolve U.S. and foreign addresses to countries via geocoding, and fall back on employer or organization affiliations when addresses are missing [1] [6] [2].
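The normalization and country-inference step can be sketched roughly as follows. This is a deliberately small stand-in: the alias table `COUNTRY_ALIASES` and the "state plus ZIP" pattern are assumptions for illustration, and a production pipeline would more likely call a geocoding service. The ZIP heuristic can also misfire on foreign addresses that happen to end in two letters and five digits.

```python
import re
import unicodedata
from typing import Optional

# Hypothetical, deliberately tiny alias table (trailing country token -> code).
COUNTRY_ALIASES = {
    "usa": "US",
    "us": "US",
    "united states": "US",
    "france": "FR",
    "canada": "CA",
    "united kingdom": "GB",
    "uk": "GB",
}

# Heuristic: a two-letter code followed by a ZIP code at the end of the string.
US_STATE_ZIP = re.compile(r"\b[A-Z]{2}\s+\d{5}(?:-\d{4})?$")


def normalize_name(raw: str) -> str:
    """Strip accents and punctuation, upper-case, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text).upper()
    return " ".join(text.split())


def infer_country(raw_address: str) -> Optional[str]:
    """Return a country code, or None when the address is inconclusive."""
    address = " ".join(raw_address.split())
    if not address:
        return None
    last_part = address.rsplit(",", 1)[-1].strip().lower().replace(".", "")
    if last_part in COUNTRY_ALIASES:
        return COUNTRY_ALIASES[last_part]
    if US_STATE_ZIP.search(address.upper()):
        return "US"
    return None  # leave unresolved rather than guess
```

Returning `None` for inconclusive addresses, rather than defaulting to a country, keeps unresolved records visible for later review.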
4. Map contributions to senators and committees, respecting filing pathways
Link each contribution record to the relevant senator by mapping committee IDs, candidate committees, or Secretary of the Senate filings. Senate committees historically filed with the Secretary of the Senate and only later began filing electronically on the same footing as other filers, so the reconciliation must handle both ingestion streams [4] [3]. Where a direct candidate linkage is missing, use FEC committee metadata to attribute transfers and committee support [7] [3].
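A minimal sketch of the linkage step is shown below, assuming a committee-to-senator lookup has already been built from committee metadata. The dictionary keys (`committee_id`, `senator_id`) and all identifiers are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def link_to_senators(
    contributions: List[Record],
    committee_to_senator: Dict[str, str],
) -> Tuple[List[Record], List[Record]]:
    """Attach a senator ID where the committee is known; set the rest aside."""
    linked: List[Record] = []
    unlinked: List[Record] = []
    for record in contributions:
        senator_id = committee_to_senator.get(record["committee_id"])
        if senator_id is None:
            # Keep unmatched records for manual review instead of dropping them.
            unlinked.append(record)
        else:
            linked.append({**record, "senator_id": senator_id})
    return linked, unlinked


# Made-up identifiers, for illustration only.
lookup = {"C00000001": "S-EXAMPLE-001"}
contributions = [
    {"committee_id": "C00000001", "amount": "250"},
    {"committee_id": "C00000099", "amount": "100"},
]
linked, unlinked = link_to_senators(contributions, lookup)
```

Returning the unmatched records as a separate list makes it possible to report how much money could not be attributed to any senator.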
5. Aggregate by country and handle special cases
Aggregate amounts by inferred contributor country and by senator, separating contributions by legal type (individuals vs. PACs) and flagging entities with ambiguous jurisdiction, such as global funds, PO boxes, and corporations with multinational headquarters. Because campaign finance rules and disclosure forms differ, report the aggregated totals alongside counts of unverifiable or ambiguous records so that readers can see how much of each total rests on uncertain attribution [3] [1].
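The aggregation can be sketched with the standard library alone. The row keys used here (`senator_id`, `inferred_country`, `contribution_type`, `amount`, `ambiguous`) are assumed names, and the `"UNKNOWN"` bucket is an illustrative convention for records whose country could not be inferred.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Any, Dict, List, Tuple

Key = Tuple[str, str, str]  # (senator_id, country, contribution_type)


def aggregate(rows: List[Dict[str, Any]]) -> Dict[Key, Dict[str, Any]]:
    """Sum amounts per senator, country, and type, counting ambiguous rows."""
    totals: Dict[Key, Dict[str, Any]] = defaultdict(
        lambda: {"amount": Decimal("0"), "records": 0, "ambiguous": 0}
    )
    for row in rows:
        country = row["inferred_country"] or "UNKNOWN"
        bucket = totals[(row["senator_id"], country, row["contribution_type"])]
        bucket["amount"] += row["amount"]
        bucket["records"] += 1
        if row["inferred_country"] is None or row.get("ambiguous", False):
            bucket["ambiguous"] += 1
    return dict(totals)


# Made-up rows, for illustration only.
rows = [
    {"senator_id": "S1", "inferred_country": "US",
     "contribution_type": "individual", "amount": Decimal("100")},
    {"senator_id": "S1", "inferred_country": "US",
     "contribution_type": "individual", "amount": Decimal("50.50")},
    {"senator_id": "S1", "inferred_country": None,
     "contribution_type": "individual", "amount": Decimal("25")},
    {"senator_id": "S2", "inferred_country": "CA",
     "contribution_type": "pac", "amount": Decimal("10"), "ambiguous": True},
]
totals = aggregate(rows)
```

Each output cell carries both a record count and an ambiguous-record count, so a published table can show the uncertain share next to every total.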
6. Validate, document, and disclose methodological limits
Validate results with spot checks against the original FEC and Senate disclosure images, and document known gaps. Senate financial disclosures are maintained by the Office of Public Records and may be limited in historical depth (six years for some items), while candidate reports have separate retention rules. State explicitly that neither the FEC nor the Senate guarantees a native “country” field and that pre-2018 Senate filings may be on microfilm or may otherwise require extra OCR and cleanup [2] [8] [4] [3].
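For the spot checks, a reproducible random sample makes it possible for someone else to re-verify the same records. The helper below is a hypothetical sketch; the function name, the seed, and the record ID format are assumptions.

```python
import random
from typing import List, Sequence


def spot_check_sample(
    record_ids: Sequence[str], sample_size: int, seed: int
) -> List[str]:
    """Pick a reproducible random subset of record IDs for manual review."""
    ordered = sorted(record_ids)  # stable order, independent of input order
    k = min(sample_size, len(ordered))
    return random.Random(seed).sample(ordered, k)


# Made-up record IDs, for illustration only.
all_ids = [f"sample.txt:{n}" for n in range(1, 101)]
to_review = spot_check_sample(all_ids, sample_size=10, seed=2024)
```

Recording the seed in the methodology notes lets a reader regenerate exactly the list of records that were compared against the source images.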
7. Maintain updates, provenance, and ethics safeguards
Run the pipeline on a repeatable schedule keyed to FEC and Senate update cycles, preserve original file IDs, and document each transformation so the results can be audited. Treat any “foreign-sourced” contribution label with caution, because legal definitions and reporting requirements differ between FEC filings and Senate financial disclosures [5] [3].
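An audit trail of this kind can be as simple as one log entry per transformation, tied to a hash of the source file. The sketch below uses only the standard library; the entry fields, step names, and file ID are illustrative assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict


def provenance_entry(
    source_file_id: str, raw_bytes: bytes, step: str, detail: str
) -> Dict[str, str]:
    """Describe one transformation applied to one source file."""
    return {
        "source_file_id": source_file_id,
        # The hash shows later whether the source file changed between runs.
        "sha256": hashlib.sha256(raw_bytes).hexdigest(),
        "step": step,
        "detail": detail,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


# Made-up file ID and content, for illustration only.
entry = provenance_entry(
    source_file_id="sample.txt",
    raw_bytes=b"C00000001|DOE, JANE|SPRINGFIELD|IL|250\n",
    step="infer_country",
    detail="alias table plus state/ZIP heuristic",
)
log_line = json.dumps(entry, sort_keys=True)  # one JSON object per log line
```

Appending one such line per step and per file gives a plain-text log that can be diffed between scheduled runs.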