Methodology
How the data is processed
A factual description of how records move from official sources into this platform, and of the limitations users should keep in mind.
Source ingestion
Records are pulled from official open-data endpoints, primarily the Oireachtas Open Data API for politicians and divisions and CKAN-published procurement datasets for contract awards. Each ingestion run is paginated, rate-limited, and resumable.
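As an illustration only, a paginated, rate-limited, and resumable run could be sketched as below. The function name, the cursor-based page shape, and the checkpoint callbacks are all assumptions made for this example; they are not the platform's actual code or the API of either source.

```python
import time
from typing import Callable, Iterator, Optional

# Hypothetical page shape: (records, next_cursor); next_cursor is None on the last page.
Page = tuple[list[dict], Optional[str]]

def ingest(
    fetch_page: Callable[[Optional[str]], Page],
    load_cursor: Callable[[], Optional[str]],
    save_cursor: Callable[[Optional[str]], None],
    min_interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield records page by page, pausing between requests and
    checkpointing the cursor so an interrupted run can resume."""
    cursor = load_cursor()            # resume from the last checkpoint, if any
    while True:
        records, next_cursor = fetch_page(cursor)
        yield from records
        if next_cursor is None:       # last page reached
            break
        save_cursor(next_cursor)      # checkpoint only after the page was consumed
        cursor = next_cursor
        sleep(min_interval)           # crude rate limit: fixed delay between requests
```

Passing the fetch and checkpoint functions in as arguments keeps the loop independent of any particular HTTP client or storage layer, which also makes it easy to exercise without network access.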
Snapshots
Every successful fetch is recorded as a snapshot row containing the endpoint, the retrieval timestamp, the ingestion status, and a reference to the raw payload as it was returned by the source. Snapshots are immutable.
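One way to picture an immutable snapshot row is a frozen dataclass. This is a sketch under assumptions: the class name, field names, and example values are hypothetical and only mirror the four pieces of information listed above.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone

@dataclass(frozen=True)
class Snapshot:
    endpoint: str           # URL or dataset identifier that was fetched
    retrieved_at: datetime  # retrieval timestamp
    status: str             # ingestion status, e.g. "ok" (hypothetical value)
    payload_ref: str        # reference to the stored raw payload

snap = Snapshot(
    endpoint="https://example.invalid/members",  # placeholder, not a real endpoint
    retrieved_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    status="ok",
    payload_ref="raw/0001.json",
)

try:
    snap.status = "failed"  # any attempt to modify a frozen instance raises
except FrozenInstanceError:
    print("snapshots cannot be modified after creation")
```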
Checksums
Each snapshot stores a SHA-256 checksum of the raw payload. A re-fetch whose checksum matches the previous snapshot is treated as unchanged; a differing checksum triggers downstream upserts and audit-log entries.
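The checksum comparison can be sketched with Python's standard hashlib module. The helper names are hypothetical, and treating a first-ever fetch (no previous checksum) as changed is an assumption of this example.

```python
import hashlib
from typing import Optional

def sha256_hex(payload: bytes) -> str:
    """Hex-encoded SHA-256 digest of the raw payload bytes."""
    return hashlib.sha256(payload).hexdigest()

def has_changed(payload: bytes, previous_checksum: Optional[str]) -> bool:
    """True when there is no earlier checksum (assumed: first fetch)
    or when the new payload hashes to a different value."""
    return previous_checksum is None or sha256_hex(payload) != previous_checksum
```

Hashing the raw bytes rather than a parsed form means that any byte-level difference in the response, even a whitespace change, counts as a change.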
Schema mapping
Procurement datasets are published with inconsistent column headers across years and quarters. Each dataset's columns are detected, normalised, and mapped to a canonical field set. The mapping for each dataset version is stored as a schema signature so historical loads remain reproducible.
Columns we cannot confidently map are recorded as unmapped_columns and surfaced in admin tools rather than silently dropped.
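The detect, normalise, and map steps might look roughly like the sketch below. The alias table, the canonical field names, and the way the signature is computed (a hash over the sorted, normalised headers) are illustrative assumptions, not the platform's actual mapping rules.

```python
import hashlib
import re

# Hypothetical alias table: normalised header -> canonical field name.
ALIASES = {
    "supplier": "supplier_name",
    "supplier_name": "supplier_name",
    "vendor": "supplier_name",
    "contract_value": "contract_value",
    "value_eur": "contract_value",
    "award_date": "award_date",
    "date_of_award": "award_date",
}

def normalise(header: str) -> str:
    """Lower-case, trim, and collapse runs of non-alphanumerics to '_'."""
    return re.sub(r"[^a-z0-9]+", "_", header.strip().lower()).strip("_")

def map_columns(headers: list[str]) -> tuple[dict[str, str], list[str]]:
    """Return (original header -> canonical field, unmapped headers)."""
    mapping: dict[str, str] = {}
    unmapped: list[str] = []
    for header in headers:
        canonical = ALIASES.get(normalise(header))
        if canonical is None:
            unmapped.append(header)   # kept visible, never silently dropped
        else:
            mapping[header] = canonical
    return mapping, unmapped

def schema_signature(headers: list[str]) -> str:
    """Order-independent fingerprint of a dataset's normalised headers."""
    joined = "\n".join(sorted(normalise(h) for h in headers))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

For example, `map_columns(["Supplier Name", " Value (EUR) ", "Notes"])` maps the first two headers and returns `["Notes"]` as unmapped.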
Verification status
Every record carries one of two verification badges:
- source_verified — keyed to a stable source identifier (e.g. an Oireachtas member code or a contract reference).
- needs_review — could not be uniquely keyed to a source identifier and is queued for manual review.
Confidence scores are derived solely from how the record was matched to its source, never from inferred content.
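A minimal sketch of the badge rule follows, assuming the matcher returns a list of candidate source identifiers for each record. The function name and that input shape are hypothetical.

```python
def verification_status(candidate_source_ids: list[str]) -> str:
    """Exactly one distinct, non-empty source identifier gives
    'source_verified'; zero or several give 'needs_review'."""
    unique = {sid for sid in candidate_source_ids if sid}
    return "source_verified" if len(unique) == 1 else "needs_review"
```

The decision depends only on how the record was keyed, not on anything inferred from its content.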
Party-majority comparison
For each recorded division, we determine the vote cast by the majority of each party's voting members. Each member's vote is then compared with their own party's majority. Where the two differ, the record is labelled exactly "Voted against party majority".
This is a purely arithmetic comparison of the votes published by the Oireachtas. It implies nothing about motivation, dissent, rebellion, or party discipline. Members who are absent or who abstain are not counted as voting against their party.
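The comparison can be sketched for a single division as follows. The vote codes "yes", "no", and "abstain" are placeholders rather than the source's own values. Two further assumptions are made here because the text above does not settle them: only "yes" and "no" votes count towards a party's majority, and a tied party has no majority, so none of its members are labelled.

```python
from collections import Counter, defaultdict

LABEL = "Voted against party majority"
COUNTED = {"yes", "no"}  # placeholder codes; abstentions and absences are ignored

def against_party_majority(votes: list[tuple[str, str, str]]) -> dict[str, str]:
    """votes holds (member, party, vote) rows for one division.
    Returns {member: LABEL} for members whose vote differs from
    their party's majority vote."""
    tallies: dict[str, Counter] = defaultdict(Counter)
    for _member, party, vote in votes:
        if vote in COUNTED:
            tallies[party][vote] += 1

    majority: dict[str, str] = {}
    for party, tally in tallies.items():
        ranked = tally.most_common()
        # Assumption: a tie between the top two options means no majority.
        if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
            majority[party] = ranked[0][0]

    return {
        member: LABEL
        for member, party, vote in votes
        if vote in COUNTED and party in majority and vote != majority[party]
    }
```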
Procurement data limitations
- Coverage is limited to contracts that authorities have chosen to publish as open data.
- Contract values are published inconsistently, and many records carry no value at all.
- CPV (procurement category) codes are missing or partial in many datasets.
- Award dates are sometimes derived from the dataset reporting period rather than the actual award date; this is recorded explicitly in the award_date_status field.
- Supplier names appear in different forms across datasets; matching is conservative and unmatched suppliers remain visible.
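To illustrate how the award_date_status field can make the origin of a date explicit, here is a hedged sketch. The status values and the choice of the reporting period's start date as the fallback are assumptions made for this example; only the field name comes from the description above.

```python
from datetime import date
from typing import Optional

def resolve_award_date(
    published: Optional[date],
    reporting_period_start: Optional[date],
) -> tuple[Optional[date], str]:
    """Return (award_date, award_date_status). Status values are hypothetical."""
    if published is not None:
        return published, "published"
    if reporting_period_start is not None:
        # Fallback date taken from the dataset's reporting period, flagged as such.
        return reporting_period_start, "derived_from_reporting_period"
    return None, "missing"
```

Storing the status alongside the date lets downstream views distinguish an award date stated by the authority from one taken from the reporting period.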
Missing source fields
Where a source field is blank, the platform displays it as blank. We do not infer, estimate, interpolate, or carry values forward from related records. A missing field reflects a limitation of the originating dataset, not of the platform.