What this covers
Every public record currently surfaced by the Arkansas Secretary of State (AR SOS) across six domains, searchable from a single place and cross-referenced between domains.
- Campaign Finance: candidates, committees, PACs, contributions, expenditures (Arkansas Ethics Commission ReFrame system).
- Business Entities: corporations, LLCs, partnerships, fictitious names, registered agents, franchise-tax status (sos-corp-search.ark.org).
- UCC Filings: financing statements, amendments, terminations, debtor/secured-party lookup. Ingestion pending the May 5-11 platform migration.
- Lobbyist & PAC: registered lobbyists, firms, clients, expenditures, entities lobbied (Arkansas Ethics Commission).
- Statement of Financial Interest: public officials, General Assembly members, state employees reporting extra income.
- People & Companies: unified cross-referenced profiles built from rules-based entity resolution across all five domains above.
Data sources
- ReFrame (Arkansas Ethics Commission): public JSON API behind api-ethics-disclosures.sos.arkansas.gov. Candidates, committees, contributions, expenditures, lobbyists, SFIs.
- Arkansas SOS Corporate Search: sos-corp-search.ark.org/corps. Laravel application; a POST with a CSRF token returns an HTML result table.
- Follow the Money (FTM): 2015-2020 aggregate totals and 2023-2024 election outcomes for AR candidates, merged into the campaign finance domain where a match exists.
- Legacy UCC (sos-ucc.ark.org): behind the Arkansas.gov INA paid subscription. Ingestion blocked until the replacement public portal goes live in May 2026.
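The Corporate Search flow described above (fetch the form, read the CSRF token, POST it back, parse the HTML table) can be sketched offline as follows. This is a minimal illustration, not the portal's actual scraper: the hidden field name `_token` is Laravel's default and is assumed here, and the class name, function name, and sample markup are hypothetical. No network request is made.

```python
from html.parser import HTMLParser


class FormAndTableParser(HTMLParser):
    """Collects a hidden CSRF input value and the cell text of table rows."""

    def __init__(self, token_field="_token"):
        super().__init__()
        self.token_field = token_field  # assumed name (Laravel's default)
        self.token = None
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and attributes.get("name") == self.token_field:
            self.token = attributes.get("value")
        elif tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if self._row is not None:
                # Collapse internal whitespace in the cell text.
                self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_page(html):
    """Return (csrf_token_or_None, list_of_rows) for one HTML page."""
    parser = FormAndTableParser()
    parser.feed(html)
    parser.close()
    return parser.token, parser.rows
```

In a real scraper the token read from the search form would be sent back in the POST body within the same cookie session, and the response page would go through the same parser to recover the result rows.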
Refresh cadence
This is a pitch-stage build. Scrapes are run manually on demand. In production, the scrapers would run nightly against each source; the schema and scripts are already built for that cadence — only the scheduler needs wiring (GitHub Actions or Vercel Cron).
Data access & coverage caveats
Public-facing search forms impose limits that keep the portal from holding a 100% snapshot of every disclosure. A production rollout with the AR SOS office would use privileged bulk feeds to close these gaps.
- Business Entities: partial. The AR corporate registry holds roughly 500,000 entities. The public search form silently rejects any query returning 250 or more matches, so we harvest with a recursive alphabetic prefix walk, splitting any rejected prefix into longer ones until each query falls under the limit. Full coverage is available via the Tyler/INA bulk-data FTP feed (paid subscription, approx. $50-200/mo) or a one-time bulk records request to the SOS office (corprequest@sos.arkansas.gov, 501-682-1010).
- UCC filings: blocked. The legacy UCC search sits behind the Arkansas.gov INA username/password paywall (no public debtor search). Waiting for the new public portal launching May 11, 2026.
- Campaign finance detail: partial. Candidate/committee summaries and FTM historical outcomes are complete. Individual contribution and expenditure rows are a small sample of the full ReFrame firehose: around 2,000 of ~1.13M contributions (roughly 0.2%) and 5,000 of ~74K expenditures (roughly 7%) are loaded. A full backfill is straightforward but deferred.
- SFI PDFs: not parsed. Statement-of-Financial-Interest filings are indexed by filer and year, but individual income-source line items inside each PDF are not yet extracted.
The AR SOS office already owns all of this data. A production engagement would grant FTP/API access directly to the portal and eliminate every caveat above.
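The recursive alphabetic prefix walk used for Business Entities can be sketched as below. This is an illustration under stated assumptions, not the production harvester: `search` is a hypothetical callable standing in for one form submission, assumed to match names by "starts with" and to return `None` when the form rejects an over-limit query. The alphabet, `max_depth`, and the in-memory fake registry are likewise assumptions.

```python
import string

RESULT_CAP = 250  # per the caveat above: 250 or more matches are rejected
ALPHABET = string.ascii_lowercase + string.digits  # assumed character set


def prefix_walk(search, prefix="", alphabet=ALPHABET, max_depth=6):
    """Return (records, unresolved_prefixes) for everything under `prefix`.

    `search(prefix)` is hypothetical: it returns a list of records, or None
    when the query was rejected for having too many matches.
    """
    rows = search(prefix) if prefix else None  # never send the empty query
    if rows is not None:
        return list(rows), []
    if len(prefix) >= max_depth:
        # Still over the cap at maximum depth: record it for manual follow-up.
        return [], [prefix]
    found, unresolved = [], []
    for ch in alphabet:
        sub_found, sub_unresolved = prefix_walk(
            search, prefix + ch, alphabet, max_depth
        )
        found.extend(sub_found)
        unresolved.extend(sub_unresolved)
    return found, unresolved


def make_fake_search(names, cap=RESULT_CAP):
    """In-memory stand-in for the search form, for demonstration only."""
    def search(prefix):
        hits = [n for n in names if n.startswith(prefix)]
        return None if len(hits) >= cap else hits
    return search
```

Two limitations of this sketch are worth noting: a name exactly equal to a rejected prefix is never returned by any longer prefix, and names containing characters outside the chosen alphabet (spaces, punctuation) are missed once their prefix has been split. Both would need separate handling in a real harvest.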
Entity resolution
Rules-based name normalization across all domains: lowercase, strip titles (Judge, Governor, Attorney General, etc.), handle “Last, First” reversal, collapse whitespace. Exact normalized matches are auto-merged into a canonical entity; near-duplicate pairs (same last name + same first initial) are logged to a review table for later human confirmation.
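The normalization and review rules above could look roughly like this in code. It is a minimal sketch: the title list contains only the three examples named in the text (the full list is not given), the function names are hypothetical, and company-specific handling such as "Acme, LLC" is deliberately left out.

```python
import re
from collections import defaultdict
from itertools import combinations

# Only the titles named in the text; the real list is longer ("etc.").
TITLES = ("attorney general", "governor", "judge")
_TITLE_RE = re.compile(
    r"^(?:%s)\b\.?\s*" % "|".join(re.escape(t) for t in TITLES)
)


def _strip_titles(s):
    """Remove any leading titles, repeating until none remain."""
    prev = None
    while prev != s:
        prev, s = s, _TITLE_RE.sub("", s)
    return s


def normalize(name):
    """Lowercase, collapse whitespace, undo 'Last, First', strip titles."""
    s = " ".join(name.lower().split())
    if "," in s:
        last, first = (part.strip() for part in s.split(",", 1))
        s = f"{_strip_titles(first)} {_strip_titles(last)}"
    return " ".join(_strip_titles(s).split())


def resolve(raw_names):
    """Return (canonical groups, near-duplicate pairs for human review)."""
    canonical = defaultdict(list)
    for raw in raw_names:
        canonical[normalize(raw)].append(raw)  # exact match: auto-merge

    by_key = defaultdict(list)
    for norm in canonical:
        parts = norm.split()
        if len(parts) >= 2:
            # Same last name + same first initial: candidate for review.
            by_key[(parts[-1], parts[0][0])].append(norm)

    review = [
        pair
        for group in by_key.values()
        for pair in combinations(sorted(group), 2)
    ]
    return dict(canonical), review
```

For example, "Judge John Smith" and "SMITH,  John" both normalize to "john smith" and merge automatically, while "Jon Smith" stays separate and the pair ("john smith", "jon smith") goes to the review list.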
The cross-referenced People & Companies index surfaces entities that appear in two or more domains — for example, a sitting legislator who is also a registered lobbyist, or a campaign donor who is an officer of an AR-registered LLC.
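Selecting entities that appear in two or more domains is a simple grouping step. The sketch below assumes each record has already been reduced to a (domain, canonical entity id) pair; the function name, domain labels, and ids are hypothetical.

```python
from collections import defaultdict


def cross_referenced(records):
    """Map each entity id to its sorted domains, keeping only entities
    that appear in at least two distinct domains."""
    domains = defaultdict(set)
    for domain, entity_id in records:
        domains[entity_id].add(domain)
    return {
        entity_id: sorted(seen)
        for entity_id, seen in domains.items()
        if len(seen) >= 2
    }
```

Repeated records from the same domain do not count twice, so an entity with many contributions but no presence elsewhere stays out of the index.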
Pitch value proposition
- One public search box for every SOS disclosure, with no more jumping between six different websites.
- Relationship visibility: who donates to whom, who lobbies whom, who sits on which board.
- Ready for mobile, screen readers, and deep-linking to individual filings.
- Built on open standards (Next.js, Postgres, standard REST scraping), with zero vendor lock-in.
- The scraping schema already accommodates the new UCC platform whenever it goes live.