Methodology

SpendLedger republishes official government spending records in a consistent, browsable form. This page describes exactly how the data gets from a government portal to a page on this site, what we do to it along the way, and what we deliberately leave alone.

Collection

We ingest official bulk data published by each government's own transparency portal (see data sources). We prefer bulk downloads and official APIs over scraping, and we record the source, fetch time, and raw record for every row so each published figure traces back to the government record it came from. Loads are append-oriented: when a source restates a record, we keep the prior version with effective dates rather than overwriting history.
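As a rough illustration of an append-oriented load, the sketch below keeps every version of a record and closes out the prior version with an end date when a source restates it. The RecordVersion fields, the record_id key, and the append_version helper are hypothetical names chosen for this example, not our actual schema.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RecordVersion:
    record_id: str                 # hypothetical stable key for a source record
    source: str                    # which portal the row came from
    fetched_at: datetime           # when this version was fetched
    raw: str                       # raw record exactly as published
    effective_from: datetime
    effective_to: Optional[datetime] = None  # None marks the current version


def append_version(history, record_id, source, raw, fetched_at):
    """Return a new history list; prior versions are closed out, never deleted."""
    current = next(
        (v for v in history if v.record_id == record_id and v.effective_to is None),
        None,
    )
    if current is not None and current.raw == raw:
        return list(history)  # source republished the same content: nothing to add
    # Close the previous current version instead of overwriting it.
    updated = [
        replace(v, effective_to=fetched_at) if v is current else v
        for v in history
    ]
    updated.append(
        RecordVersion(record_id, source, fetched_at, raw, effective_from=fetched_at)
    )
    return updated


first_fetch = datetime(2026, 1, 1, tzinfo=timezone.utc)
second_fetch = datetime(2026, 2, 1, tzinfo=timezone.utc)
history = append_version([], "r1", "example-portal", "amount=100.00", first_fetch)
history = append_version(history, "r1", "example-portal", "amount=120.00", second_fetch)
```

After the second call, the history holds both versions: the original with an end date equal to the second fetch time, and the restated record as the current version.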

Normalization

Vendor and agency names are normalized with deterministic rules only: case folding, punctuation and whitespace cleanup, standardized legal suffixes (INC, LLC, CORP), and separation of "doing business as" clauses. We do not use fuzzy or probabilistic matching, and we do not merge similar-looking vendors across sources. The original name, exactly as the government published it, is what appears on each profile. A vendor profile therefore reflects one payee name in one jurisdiction's records. If a company is paid under several spellings, those remain separate profiles until the underlying records justify grouping them.
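A minimal sketch of what deterministic normalization of this kind can look like, assuming a small illustrative suffix map and a simple pattern for "doing business as" clauses. The function name, the suffix table, and the exact punctuation handling are assumptions for this example; the production rule set is not reproduced here. The original string is never modified, only a normalized form is derived from it.

```python
import re

# Illustrative suffix map (assumption); maps spelled-out forms to a standard suffix.
LEGAL_SUFFIXES = {
    "INCORPORATED": "INC",
    "INC": "INC",
    "LLC": "LLC",
    "CORPORATION": "CORP",
    "CORP": "CORP",
}

# Matches a "doing business as" marker between two names (assumed spellings).
DBA_PATTERN = re.compile(r"\s+(?:DBA|D/B/A|DOING BUSINESS AS)\s+")


def _clean(text):
    text = text.replace(".", "")            # "L.L.C." -> "LLC", "INC." -> "INC"
    text = re.sub(r"[^\w\s&]", " ", text)   # other punctuation becomes a space
    tokens = text.split()                   # also collapses repeated whitespace
    if tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens[-1] = LEGAL_SUFFIXES[tokens[-1]]
    return " ".join(tokens)


def normalize_vendor_name(original):
    """Return (normalized_name, dba_name_or_None) using fixed rules only."""
    name = original.upper()                        # case folding
    name = re.sub(r"\s+", " ", name).strip()       # whitespace cleanup
    parts = DBA_PATTERN.split(name, maxsplit=1)    # separate the DBA clause
    legal = parts[0]
    dba = parts[1] if len(parts) > 1 else None
    return _clean(legal), (_clean(dba) if dba else None)
```

Because every step is a fixed rule, the same input always yields the same output, and two names that differ in more than case, punctuation, or suffix spelling stay distinct.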

What we publish, and what we exclude

We publish profile pages only for entities with enough underlying data to be useful. Entities below that floor exist in our database but get no page. Payees masked by the source for privacy (for example, Ohio's "Masked Payee" designation, which covers payments to individuals) are never published as profiles, though their dollars still appear in agency totals because they are real spending.

We also suppress more than the source requires. Government bulk files sometimes name individual people (grant recipients, care providers, refund claimants) whom the same government's own search tool deliberately masks. Where payment patterns indicate an individual rather than a business, we withhold the name entirely: no profile and no appearance in any list or export. Their dollars remain in agency totals. We would rather under-publish names than disclose more than a state's own privacy policy does.
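The separation between withheld names and counted dollars can be sketched as follows. The Payment fields and both helper functions are hypothetical, and looks_individual stands in for a detection rule that this example does not attempt to implement.

```python
from collections import defaultdict
from decimal import Decimal
from typing import NamedTuple


class Payment(NamedTuple):
    agency: str
    payee: str
    amount: Decimal
    masked_by_source: bool   # the source itself masked this payee
    looks_individual: bool   # hypothetical flag; the detection rule is not shown


def agency_totals(payments):
    """Sum every payment, including masked and withheld payees."""
    totals = defaultdict(Decimal)
    for p in payments:
        totals[p.agency] += p.amount
    return dict(totals)


def publishable_payees(payments):
    """Names eligible to appear in lists; masked or individual payees are dropped."""
    return sorted(
        {p.payee for p in payments if not (p.masked_by_source or p.looks_individual)}
    )


sample = [
    Payment("Health", "ACME SUPPLY INC", Decimal("100.00"), False, False),
    Payment("Health", "Masked Payee", Decimal("40.00"), True, False),
    Payment("Health", "JANE Q EXAMPLE", Decimal("10.00"), False, True),
]
```

With this sample, the Health total covers all three payments, while only ACME SUPPLY INC is eligible to appear by name.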

Figures

Dollar amounts, dates, agencies, and category codes are shown as reported by the source portal. We compute rollups (totals, counts, fiscal-year summaries, rankings) from those records; we never estimate, impute, or model missing values. Fiscal years follow each jurisdiction's own fiscal calendar.
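As an illustration of a fiscal-year rollup, the sketch below assigns each payment date to a fiscal year given the month in which a jurisdiction's fiscal year starts, then sums amounts per year. It assumes the common convention of naming a fiscal year after the calendar year in which it ends; the function names and the choice to leave rows with a missing amount out of the sum (rather than filling them in) are assumptions for this example.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal


def fiscal_year(paid_on, start_month):
    """Fiscal year label, named for the calendar year in which it ends (assumed)."""
    if start_month == 1:
        return paid_on.year                  # fiscal year equals calendar year
    return paid_on.year + 1 if paid_on.month >= start_month else paid_on.year


def fiscal_year_totals(rows, start_month):
    """Sum (paid_on, amount) rows per fiscal year without estimating anything."""
    totals = defaultdict(Decimal)
    for paid_on, amount in rows:
        if amount is None:
            continue                         # missing stays missing; never imputed
        totals[fiscal_year(paid_on, start_month)] += amount
    return dict(totals)


rows = [
    (date(2025, 6, 30), Decimal("50.00")),   # last day of a July-to-June year
    (date(2025, 7, 1), Decimal("25.00")),    # first day of the next one
    (date(2025, 12, 15), None),              # amount missing in the source
    (date(2026, 3, 1), Decimal("10.00")),
]
totals = fiscal_year_totals(rows, start_month=7)
```

For a fiscal year that starts in July, the first row falls in fiscal year 2025 and the remaining dated amounts fall in fiscal year 2026.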

Update cadence and corrections

Data refreshes when the source portal publishes new records; every page shows its dataset's refresh date (currently Jul 4, 2026). Because source portals correct their own data over time, small historical revisions are normal. If you believe a figure is wrong, check the source portal first; if it disagrees with us, we want to know.

Known limitations

Coverage begins when the source portal's usable bulk data begins, not when the government began spending. Payment records are not contracts: a payment's category and agency reflect accounting codes, not necessarily the program that benefited. Vendor totals can understate a company's true footprint when it is paid under multiple names or through intermediaries.