Most government spending data is public. Most of it goes unanalyzed. We built automated anomaly detection on top of the public checkbook portals of five states and found patterns that agencies can act on well before annual audit cycles catch up.
The Open Data Advantage
A growing number of U.S. states publish detailed expenditure data through transparency portals. Connecticut, Virginia, and Florida publish invoice-level payment records. North Carolina and Texas publish at varying levels of aggregation. This data is intended to serve public accountability, but it has a secondary use that few organizations have explored: proactive anomaly detection.
When you can see every payment a state agency makes to every vendor over a multi-year period, certain patterns become visible that no single invoice review could surface.
Five Anomaly Categories
Our analysis focuses on five categories of statistical anomaly, each grounded in established audit methodology:
Duplicate payments. We identify payments to the same vendor, for the same amount, within a narrow time window. Exact-match duplicates are the simplest case. More interesting are near-duplicates: amounts within 1% of each other, invoiced days apart, which may indicate a vendor resubmitting a rejected invoice with minor modifications.
In Virginia's open data, we identified over 300 potential duplicate payment pairs across state agencies in the most recent fiscal year. The aggregate dollar exposure exceeded $40 million. Many of these will have legitimate explanations - recurring service contracts with identical monthly amounts, for example. The pairs without such an explanation represent real recovery opportunities.
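One way a duplicate check of this kind might be sketched in Python. The `Payment` fields, the 14-day window, and the function name are illustrative assumptions, not details of any state's portal schema; only the same-vendor, same-amount, and 1% near-match rules come from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Payment:
    # Hypothetical record shape; real portals use their own column names.
    vendor: str
    amount: float
    paid_on: date


def find_duplicate_pairs(payments, window_days=14, tolerance=0.01):
    """Return (earlier, later, kind) tuples for same-vendor payments made
    within `window_days` of each other whose amounts match exactly
    ("exact") or differ by at most `tolerance` as a fraction ("near")."""
    by_vendor = defaultdict(list)
    for p in payments:
        by_vendor[p.vendor].append(p)

    pairs = []
    for vendor_payments in by_vendor.values():
        vendor_payments.sort(key=lambda p: p.paid_on)
        for i, first in enumerate(vendor_payments):
            for second in vendor_payments[i + 1:]:
                if (second.paid_on - first.paid_on).days > window_days:
                    break  # list is date-sorted, so later ones are further away
                larger = max(first.amount, second.amount)
                if larger <= 0:
                    continue  # skip zero or negative (credit) amounts
                diff = abs(first.amount - second.amount) / larger
                if diff == 0:
                    pairs.append((first, second, "exact"))
                elif diff <= tolerance:
                    pairs.append((first, second, "near"))
    return pairs
```

Every pair this returns is a candidate, not a finding: recurring contracts with identical monthly amounts will match too, so a review step still has to separate them out.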
Vendor concentration risk. When a single vendor captures more than 40% of an agency's total spend in a given category, it raises questions about competitive procurement. High concentration isn't inherently problematic - some services have limited provider pools. But it is a reliable indicator of where contract oversight should be most rigorous, because the financial exposure is concentrated.
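The concentration test reduces to a share calculation per agency and spending category. A minimal sketch, assuming input rows of `(agency, category, vendor, amount)`; that row shape and the function name are assumptions for illustration, and the 40% default comes from the threshold above.

```python
from collections import defaultdict


def concentration_flags(rows, threshold=0.40):
    """rows: iterable of (agency, category, vendor, amount) tuples.
    Returns (agency, category, vendor, share) for every vendor whose share
    of an agency's spend in a category exceeds `threshold`, largest first."""
    category_totals = defaultdict(float)
    vendor_totals = defaultdict(float)
    for agency, category, vendor, amount in rows:
        category_totals[(agency, category)] += amount
        vendor_totals[(agency, category, vendor)] += amount

    flags = []
    for (agency, category, vendor), spend in vendor_totals.items():
        total = category_totals[(agency, category)]
        if total > 0 and spend / total > threshold:
            flags.append((agency, category, vendor, spend / total))
    return sorted(flags, key=lambda f: f[3], reverse=True)
```

The result is a ranked list of where oversight attention might go first; it says nothing about whether a given concentration is justified by a limited provider pool.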
Year-end spending spikes. The "use it or lose it" dynamic in government budgeting creates a well-documented pattern: agencies accelerate spending in the final quarter of the fiscal year to avoid budget reductions. Our analysis quantifies this by comparing each agency's monthly spending against its monthly average for the full year. Agencies whose average monthly spending in the final quarter exceeds 2x that full-year monthly average warrant closer invoice-level review during those periods.
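A sketch of the spike ratio, under one reading of the 2x rule: the average monthly spend in the last three fiscal months divided by the average monthly spend across the whole fiscal year. That reading, the function names, and the input of 12 monthly totals already in fiscal-year order are assumptions.

```python
def year_end_spike_ratio(monthly_totals):
    """monthly_totals: 12 spending totals in fiscal-year order.
    Returns average final-quarter month / average month for the year."""
    if len(monthly_totals) != 12:
        raise ValueError("expected exactly 12 monthly totals")
    annual_avg = sum(monthly_totals) / 12
    if annual_avg == 0:
        return 0.0
    final_quarter_avg = sum(monthly_totals[-3:]) / 3
    return final_quarter_avg / annual_avg


def flag_year_end_spikes(agency_months, threshold=2.0):
    """agency_months: dict of agency name -> 12 monthly totals.
    Returns {agency: ratio} for agencies whose ratio exceeds `threshold`."""
    flagged = {}
    for agency, months in agency_months.items():
        ratio = year_end_spike_ratio(months)
        if ratio > threshold:
            flagged[agency] = ratio
    return flagged
```

Under this reading the ratio tops out at 4.0 (all spending in the final quarter), and a ratio above 2.0 is equivalent to the final quarter carrying more than half of the year's spend.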
Benford's Law deviations. The first digits of naturally occurring financial data follow a predictable distribution known as Benford's Law: a leading digit d appears with probability log10(1 + 1/d), so roughly 30% of amounts begin with 1 and fewer than 5% begin with 9. Significant deviations from this distribution in an agency's payment records do not prove wrongdoing, but they mark populations of payments worth investigating. This technique has been used in forensic accounting for decades and translates directly to automated analysis of public data.
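A first-digit test can be sketched with the standard library alone. The chi-square goodness-of-fit statistic used here is one common choice for measuring the deviation, not necessarily the one behind the analysis described above; the function names are illustrative.

```python
import math
from collections import Counter

# Expected share of each leading digit under Benford's Law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CHI2_CRITICAL_8DF_05 = 15.507


def first_digit(amount):
    """Leading nonzero digit of a dollar amount, read from its value
    rounded to cents. Returns None for amounts that round to zero."""
    for ch in f"{abs(amount):.2f}":
        if ch in "123456789":
            return int(ch)
    return None


def benford_chi_square(amounts):
    """Chi-square statistic comparing observed leading-digit counts
    with the counts Benford's Law would predict."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    n = len(digits)
    if n == 0:
        raise ValueError("no nonzero amounts to test")
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in BENFORD.items()
    )
```

A statistic above `CHI2_CRITICAL_8DF_05` suggests the digit distribution departs from Benford's Law. Two cautions apply: the law fits best when amounts span several orders of magnitude, and with very large payment counts the chi-square test flags even trivial deviations, so the statistic works better for ranking agencies than as a pass/fail rule.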
Round-number clustering. An unusually high proportion of payments at round numbers (exactly $10,000, $25,000, $50,000) can indicate invoices being structured to stay below review thresholds. Some agencies require additional approvals for payments above certain thresholds, creating an incentive to split or round invoices.
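Round-number clustering can be measured as a simple share. In this sketch, "round" means an exact multiple of a chosen dollar step, and small payments are excluded so they do not dilute the share; the $5,000 step, the $5,000 floor, and the function name are assumptions that would need tuning to an agency's actual approval thresholds.

```python
def round_number_share(amounts, step=5000, minimum=5000):
    """Share of payments of at least `minimum` dollars that are exact
    multiples of `step` dollars. Returns 0.0 if no payment qualifies."""
    eligible = [a for a in amounts if a >= minimum]
    if not eligible:
        return 0.0
    step_cents = step * 100
    # Compare in integer cents to avoid floating-point remainder issues.
    round_count = sum(
        1 for a in eligible if round(a * 100) % step_cents == 0
    )
    return round_count / len(eligible)
```

The share only means something relative to a baseline, such as the same agency in earlier years or peer agencies with similar contract types.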
From Detection to Action
The value of anomaly detection depends entirely on what happens after a pattern is identified. A list of 6,000 anomalies is not useful. What agencies need is prioritization: which anomalies have the highest dollar exposure, which overlap with existing audit findings, and which sit in agencies where contract management staffing is thinnest.
We address this by integrating anomaly signals with audit report findings and contract characteristics. An agency that both shows vendor concentration anomalies in its spending data and has recent audit findings about sole-source justification weaknesses is a much higher priority than one with spending anomalies alone.
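To illustrate the idea, the three prioritization inputs can be combined into a single ranking score. Everything specific in this sketch is hypothetical: the `AgencySignals` fields, the mapping from anomaly categories to audit-finding topics, the use of contracts per manager as a staffing proxy, and the 0.4/0.4/0.2 weights.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from anomaly category to related audit-finding topics.
RELATED_FINDINGS = {
    "vendor_concentration": {"sole_source_justification"},
    "duplicate_payments": {"invoice_approval_controls"},
}


@dataclass
class AgencySignals:
    agency: str
    dollar_exposure: float
    anomalies: set = field(default_factory=set)
    audit_findings: set = field(default_factory=set)
    contracts_per_manager: float = 0.0  # assumed proxy for thin staffing


def has_corroboration(s):
    """True if any spending anomaly matches a related audit finding."""
    return any(
        RELATED_FINDINGS.get(a, set()) & s.audit_findings
        for a in s.anomalies
    )


def rank_agencies(signals, weights=(0.4, 0.4, 0.2)):
    """Sort agencies by a weighted score of normalized dollar exposure,
    audit corroboration, and normalized staffing load, highest first."""
    w_exposure, w_audit, w_staff = weights
    max_exposure = max((s.dollar_exposure for s in signals), default=0) or 1
    max_load = max((s.contracts_per_manager for s in signals), default=0) or 1

    def score(s):
        return (
            w_exposure * s.dollar_exposure / max_exposure
            + w_audit * (1.0 if has_corroboration(s) else 0.0)
            + w_staff * s.contracts_per_manager / max_load
        )

    return sorted(signals, key=score, reverse=True)
```

With equal exposure and staffing, an agency whose concentration anomaly is matched by a sole-source audit finding ranks above one with the spending anomaly alone, which is the behavior described above.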
What This Means for Government Finance Leaders
Public checkbook data is already published. The cost of analyzing it is marginal. Agencies that integrate automated anomaly screening into their pre-payment workflows can shift from a reactive posture - waiting for auditors to find problems after the fact - to a proactive one that catches patterns before they compound into material findings.
The shift doesn't require new technology procurement in every case. Many anomaly checks can be implemented in existing business intelligence tools with moderate configuration effort. What it requires is a willingness to look at the data with fresh eyes and a framework for prioritizing what the analysis surfaces.