Audit Engine is a cloud-based service which will perform a full-ballot-image audit for any election that produces ballot images.
We hope that this service will become the go-to standard for fully reviewing any election conducted with hand-marked paper ballots that are scanned by widely used equipment.
Initially, Audit Engine will support ES&S and Dominion ballots and election data.
How It Works
Getting the data together
First, the data must be assembled for use by Audit Engine. Most of it is already produced by election officials and is available from them.
Ballot Image Creation:
Modern voting equipment used in most districts in the U.S. creates ballot images as it processes the ballots.
It is essential that this equipment be set so the ballot images are not deleted after they are used to extract the vote.
These images are then transferred to the Election Management System (EMS).
The EMS can export the ballot images for use by Audit Engine. These should be placed into ZIP archives with about 30,000 to 50,000 ballots per archive.
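As a rough illustration of the archive sizing above, the following Python sketch groups exported image files into evenly sized ZIP archives of at most 50,000 images each. It is not part of Audit Engine; the function names, the archive naming pattern, and the assumption that image file names are unique are all hypothetical.

```python
import math
import zipfile
from pathlib import Path
from typing import List


def plan_archives(image_paths: List[str], max_per_archive: int = 50_000) -> List[List[str]]:
    """Split image paths into the fewest evenly sized groups of at most max_per_archive."""
    if max_per_archive <= 0:
        raise ValueError("max_per_archive must be positive")
    ordered = sorted(image_paths)
    if not ordered:
        return []
    archive_count = math.ceil(len(ordered) / max_per_archive)
    group_size = math.ceil(len(ordered) / archive_count)
    return [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]


def write_archives(groups: List[List[str]], out_dir: str, prefix: str = "ballot_images") -> List[Path]:
    """Write each group to its own ZIP file and return the archive paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for number, group in enumerate(groups, start=1):
        archive = out / f"{prefix}_{number:03d}.zip"  # hypothetical naming pattern
        with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            for path in group:
                # Assumes file names are unique across the export.
                zf.write(path, arcname=Path(path).name)
        written.append(archive)
    return written
```

With 95,000 images, for example, this plan yields two archives of 47,500 images each rather than one full archive and one small remainder.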
Election Information File (EIF):
Our system requires additional information regarding the information actually printed on the ballots.
The EIF lists
all contest names, as used in the CVR, as printed on hand-marked paper ballots, and as printed on BMD ballots,
the options in each contest and the full-text descriptions of yes/no contests,
the number of write-ins and the vote-for number.
This information is available from sample ballots and can be assembled prior to the election.
This is a spreadsheet file in .xlsx or .csv format.
The exact format of this file is provided below.
In rare cases, we also need the Ballot Options File (BOF), which lists the actual text of each option where it differs from the official CVR text.
Cast Vote Record (CVR):
Officials can produce cast-vote-record (CVR) files, which list the results of the voting system's interpretation of the ballots.
Audit Engine currently supports the ES&S format and the Dominion format.
If any ballots are suppressed for privacy reasons, the total number of ballots suppressed and their vote totals should be provided.
Audit Engine can also be run without the CVR, but some additional information is then required:
Results Summary: The official (or unofficial) results that should match the totals from the cast-vote-records file, but may not if any ballots have been suppressed for privacy concerns. For some systems, we can scrape the data from the official election website report.
Style to Contests File: This is a spreadsheet which can be generated later in the process that lists the contests included in each style. This can be generated by inspecting the ballots or it can be provided by elections officials.
Payment Method: You must create a payment method that we can use to access funds for the cost of the runs. We accept credit cards or we can invoice you after credit approval.
Creating a Job
Name the Job: Create a job with a name for future reference. You must include the district, election date, and election type to distinguish it from other jobs you (and others) might submit.
Direct Uploading: We recommend that you upload your archives directly to our site using our browser interface, which has no limit to the size of each file uploaded.
Posting Service Uploading: In the future, we plan to provide transfers from your file posting service, such as Dropbox, Sharefile, Drive, etc.
Cast Vote Records File(s) (CVR Files):
A cast-vote record file (CVR) provides the result of the election as determined by the voting system.
A CVR can be broken down to the individual ballot (which is preferred) or it can provide results based on higher level groups, such as precincts or batches.
Audit Engine does not require the CVR file, and it does not rely on the CVR file in its own processing of the ballot images.
However, Audit Engine will make use of the CVR to a limited extent, if it is available, as follows:
To reduce the processing required in the initial pre-scan of the ballots.
If no CVR is provided, all ballot images are prescanned to extract style information from barcodes on hand-marked paper ballots, i.e. which contests are included on each ballot style.
If the CVR provides the style of each ballot, then a full prescan of the ballots is not required. If this style information is incorrect, the error will be discovered in the style templating and mapping process. Relying on the style from the CVR does not change the vote extraction process (which is performed without reference to the CVR).
To locate discrepancies between the voting-system tabulation and the audit system.
Only after Audit Engine has fully extracted the vote is the voting-system CVR compared with that result to produce the discrepancy report.
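The comparison step can be pictured with a small Python sketch. It assumes both the voting-system CVR and the audit result have already been reduced to a mapping from ballot ID to contest to the set of selected options. That data shape and the function name are hypothetical, chosen only for illustration, and do not describe Audit Engine's internal format.

```python
from typing import Dict, FrozenSet, List, Tuple

# ballot_id -> contest name -> selected options (hypothetical shape)
Votes = Dict[str, Dict[str, FrozenSet[str]]]
Discrepancy = Tuple[str, str, FrozenSet[str], FrozenSet[str]]


def find_discrepancies(cvr: Votes, audit: Votes) -> List[Discrepancy]:
    """List every (ballot, contest) where the CVR and the audit result disagree."""
    rows: List[Discrepancy] = []
    for ballot_id in sorted(set(cvr) | set(audit)):
        cvr_contests = cvr.get(ballot_id, {})
        audit_contests = audit.get(ballot_id, {})
        for contest in sorted(set(cvr_contests) | set(audit_contests)):
            cvr_marks = cvr_contests.get(contest, frozenset())
            audit_marks = audit_contests.get(contest, frozenset())
            if cvr_marks != audit_marks:
                rows.append((ballot_id, contest, cvr_marks, audit_marks))
    return rows
```

Because the two inputs are built independently and only compared at the end, a disagreement points to a ballot worth inspecting rather than to either system being presumed correct.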
Audit Engine supports voting-system cast-vote-record (CVR) files in vendor-defined formats and in the new "Common Data Format" as defined by NIST, or variants of it.
ES&S Legacy Format:
This is a table-oriented format which results in a reasonable size and number of files.
Use CSV (comma-separated values) or .xlsx format.
Please limit any single .xlsx file to no more than 99,999 records (100,000 lines including the header), splitting into several files if needed.
Please compress each file using ZIP.
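The record limit above can be met by splitting a large table and repeating the header in each part. The Python sketch below shows the arithmetic on in-memory rows and writes CSV text; it assumes a single header line, and the function names are hypothetical. Writing .xlsx parts would need a spreadsheet library and is not shown.

```python
import csv
import io
from typing import List

Row = List[str]


def split_records(header: Row, records: List[Row], max_records: int = 99_999) -> List[List[Row]]:
    """Split records into parts of at most max_records, each starting with the header."""
    if max_records <= 0:
        raise ValueError("max_records must be positive")
    return [
        [header] + records[start:start + max_records]
        for start in range(0, len(records), max_records)
    ]


def to_csv_text(rows: List[Row]) -> str:
    """Serialize rows as CSV text with one line per row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Each part then holds at most 99,999 records plus the header line, or 100,000 lines in total.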
Dominion Legacy Format:
This is a CSV "tidy" format which has only a few columns and many records.
Each record contains fields that identify the data value, and one data value.
Please compress each file using ZIP.
This format has been generally used for batch or precinct level reports and not broken down by ballot.
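To make the "tidy" layout concrete, here is a minimal Python sketch that sums one-value-per-record rows into contest totals. The column names (precinct, contest, option, votes) and the sample rows are invented for illustration and are not the actual Dominion column names or real election data.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Tuple

# Made-up sample in a tidy layout: identifying fields plus one data value per record.
SAMPLE_TIDY = """precinct,contest,option,votes
P001,Mayor,Alice Smith,120
P001,Mayor,Bob Jones,95
P002,Mayor,Alice Smith,80
P002,Mayor,Bob Jones,101
"""


def contest_totals(tidy_csv_text: str) -> Dict[Tuple[str, str], int]:
    """Sum the single data value of each record by (contest, option)."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    for row in csv.DictReader(io.StringIO(tidy_csv_text)):
        totals[(row["contest"], row["option"])] += int(row["votes"])
    return dict(totals)
```

Because each record carries only one value, precinct-level or batch-level reports in this layout grow long rather than wide, which is why the files have few columns and many records.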
Dominion JSON CVR format:
This format is similar to the NIST standard but is not an exact implementation.
It typically results in one JSON file per batch on a given tabulator.
For example, in the 2020 Primary in San Francisco, more than 13,000 files resulted.
All the files should be combined into a single ZIP file.
We can add other formats as needed.
Election Information File (EIF): Audit Engine generally also requires an Election Information File which provides the contest names, options, and the full-text descriptions found in each language as an .xlsx or CSV file. The format of this file is found below.
Ballot Options File (BOF): This is logically part of the EIF dataset but is needed only rarely, when the text printed for a ballot option differs from the official option. This might occur in gubernatorial or presidential contests when the option includes two names, for example one for governor and one for lieutenant governor. Even in that case it may not be required, so you may want to try a run without it first.
Results Summary: The system requires the official or unofficial results either in CSV or .xlsx format, or as a link to a web-page report, if we support that format. Even if the full CVR is provided, Audit Engine will compare its results with the final totals in the contests.
Adjudications File: After you have completed your first run, you may also want to submit an adjudications file, which amends the Audit Engine result to reflect the review of voter intent from the ballot images on any ballots that are a concern.
Job Controls: You can limit the review to given ranges of ballot image numbers or only to certain precincts.
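As an illustration of how a Results Summary might be reconciled with CVR totals when some ballots were suppressed for privacy, the Python sketch below adds the suppressed-ballot vote totals to the CVR totals and reports any contest option that still differs from the official figure. The data shape and function name are hypothetical and do not describe Audit Engine's own comparison.

```python
from collections import Counter
from typing import Dict, Tuple

# (contest, option) -> vote count (hypothetical shape)
Totals = Dict[Tuple[str, str], int]


def reconcile(cvr_totals: Totals, suppressed_totals: Totals,
              official_totals: Totals) -> Dict[Tuple[str, str], Tuple[int, int]]:
    """Return {(contest, option): (cvr_plus_suppressed, official)} for every mismatch."""
    combined = Counter(cvr_totals)
    combined.update(suppressed_totals)  # adds the suppressed-ballot votes
    mismatches = {}
    for key in set(combined) | set(official_totals):
        have = combined.get(key, 0)
        want = official_totals.get(key, 0)
        if have != want:
            mismatches[key] = (have, want)
    return mismatches
```

An empty result means the CVR, once the suppressed totals are added back, accounts for every vote in the official summary.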
Election Information File
This file deals with the inconsistencies between various presentations of the same contests and options that are on the ballot. This file is not a standard file from election systems today, but it can be constructed from other data sources, such as the sample ballot, CVR File, Summary report, etc.
Some elements of this data file are specifically to allow Audit Engine to process legacy CVR files.
In legacy ES&S CVR files:
the header may include non-unique names, and the columns are processed based only on their order.
In this case, the original_cvr_header is provided as a starting point, and the official_contest_name is then derived, typically by an analyst.
(Required) A brief yet correct and distinctive contest name. It must be an exact replacement for the Cast Vote Record file header. It can contain spaces and punctuation, but it is wise to use them sparingly.
(Optional) This column provides the exact header from the CVR files for reference. This column can be created by copying the header line and pasting it into the spreadsheet using "paste-special" and "transpose". This column will also include the first identifying columns, such as "Cast Vote Record", "Precinct", and "Ballot Style". This column is not needed for non-ES&S systems.
(Optional) The numeric id of this contest
(Optional) The sheet number where this contest is found, if a multi-sheet ballot is used. Sheet numbers start at 1.
(Required) The contest title as actually found on the ballot including all embellishments that are actually written, such as "Vote for 1". If it is exactly the same as the Official_Contest_Name, then it need not be listed. In the case of bilingual ballots, it may work best to include only the English and first portion of the text.
(Optional) If BMDs are used, this column should provide the exact string used by the BMD for this contest in the vote summary.
(Required) The maximum number of votes in this contest. 1 is assumed if not provided.
(Required) The number of "Write-in:" options provided on the ballot for this contest. If left blank, 1 is assumed. 0 means no write-in lines are provided.
(Required) List of official option names, separated by commas. The order of the options provided here is not required to match the order on the ballot. If these option names differ from what is actually printed on the ballot, then use a BOF file to define the exact text for each option.
This is the exact description of referendum or question-type (Yes/No) contests as found on the ballot. These can be entered with embedded newlines to reflect how it appears on the ballot.
(Optional) List of official option names of qualified write-ins. These are provided to BMDs to allow voters to vote for the official write-ins and avoid random write-ins that are not qualified.
Additional columns can be used as desired by the analyst and are ignored if not supported.
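The defaults described in the column list can be sketched in Python as follows. The names official_contest_name and official_options come from the text above; the column names vote_for and writein_num, the Contest class, and the sample rows are hypothetical stand-ins, since the exact headers are not given here. Extra columns are simply ignored.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Contest:
    official_contest_name: str
    vote_for: int            # hypothetical column name
    writein_num: int         # hypothetical column name
    official_options: List[str]


# Made-up sample EIF rows; blank vote_for and writein_num fall back to 1.
SAMPLE_EIF = """official_contest_name,vote_for,writein_num,official_options
Mayor,,,"Alice Smith,Bob Jones"
Measure A,1,0,"Yes,No"
"""


def _int_or_default(text: Optional[str], default: int) -> int:
    """Parse an integer cell, using the default when the cell is blank or missing."""
    text = (text or "").strip()
    return int(text) if text else default


def load_eif(csv_text: str) -> List[Contest]:
    """Read EIF rows, applying the documented defaults for blank cells."""
    contests = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get("official_contest_name") or "").strip()
        if not name:
            raise ValueError("official_contest_name is required")
        options = [o.strip() for o in (row.get("official_options") or "").split(",") if o.strip()]
        contests.append(Contest(
            official_contest_name=name,
            vote_for=_int_or_default(row.get("vote_for"), 1),
            writein_num=_int_or_default(row.get("writein_num"), 1),
            official_options=options,
        ))
    return contests
```

Note that a write-in count of 0 is kept as 0 (no write-in lines), while a blank cell becomes 1, matching the rules stated for those columns.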
Ballot Options File (BOF)
A BOF is required only if the options as printed on the ballot differ substantially from the official_options as provided in the EIF. This file includes three columns, as follows. This file need only provide records for those options that need to be substituted.
The same contest name as provided in the CVR as official_contest_name.
A ballot option as listed in official_options in the EIF.
The option as listed on the ballot, exactly as provided, including newlines. The ballot options are sometimes very different, particularly if they list a candidate pair, such as for Governor and Lieutenant Governor, or President and Vice President, listed on separate lines.
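A BOF can be thought of as a sparse lookup: only the options that need substitution have a record, and every other option keeps its official text. The Python sketch below assumes the three columns have been loaded into a dictionary keyed by contest and official option. The function name, the dictionary shape, and the sample names are hypothetical.

```python
from typing import Dict, Tuple

# (official_contest_name, official option) -> text as printed on the ballot
BallotOptions = Dict[Tuple[str, str], str]

# Made-up example: a candidate pair printed on two lines.
SAMPLE_BOF: BallotOptions = {
    ("Governor", "Jane Doe"): "Jane Doe\nJohn Roe",
}


def printed_option_text(bof: BallotOptions, contest: str, official_option: str) -> str:
    """Return the BOF substitution if one exists, else the official option text."""
    return bof.get((contest, official_option), official_option)
```

For example, `printed_option_text(SAMPLE_BOF, "Governor", "Jane Doe")` returns the two-line text, while an option with no BOF record is returned unchanged.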
Once your job has been defined, the job can be launched. There are several phases:
Phase 1 -- Uploading files:
The first phase of the process includes uploading ballot images, CVR files and EIF files.
The system will inspect the job specification and the input files for format and consistency.
Consistency checks include comparing the number of ballot images with the number of cast-vote records in the CVR files, and confirming that the contests provided in the EIF can be found in the CVR.
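The two consistency checks named above could look roughly like the following Python sketch. It assumes the counts and contest names have already been extracted from the uploads; the function name and message wording are hypothetical and are not Audit Engine's actual output.

```python
from typing import Iterable, List


def check_inputs(image_count: int, cvr_record_count: int,
                 eif_contests: Iterable[str], cvr_contests: Iterable[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if image_count != cvr_record_count:
        problems.append(
            f"ballot images ({image_count}) do not match CVR records ({cvr_record_count})"
        )
    # Every contest named in the EIF should appear somewhere in the CVR.
    for name in sorted(set(eif_contests) - set(cvr_contests)):
        problems.append(f"EIF contest not found in CVR: {name}")
    return problems
```

Running such checks before any image processing starts lets a mismatched upload be caught while it is still cheap to correct.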
Phase 2 -- Generate Style Templates:
After the files have been uploaded as provided in Phase 1, Style Templates can be generated.
Audit Engine will visually inspect the ballots and learn the format of ballots that have the same style. Each style has either all or a subset of the contests in the election, and within each style, the target for a given contest option will be in the same location.
Style Template generation provides the location of each contest option target and associates it with the correct CVR option.
This is a critical phase of processing. At the end of this process, Audit Engine produces a set of ballot images marked up to show the location of each target and the associated official_contest_name and official_ballot_option. These images must be carefully inspected to confirm that the correct contest and option are associated with each target.
After you have inspected these images, you can proceed to vote extraction.
Phase 3 -- Vote Extraction and reporting:
After generation of Style Templates and approval of the contest mapping, you can launch the extraction phase.
Audit Engine will detect marks on the ballots and create a comprehensive database containing information for each style and each ballot.
During this phase, the work will be split among many separate virtual machines in cloud-based computing services, which can provision tens of thousands of virtual machines that run in parallel in massive data centers. This allows us to finish your job in minutes instead of the many hours or days it would take on a single machine.
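The split-and-merge pattern described here can be shown on a small scale with Python's standard thread pool standing in for cloud virtual machines. This is only a local analogy under stated assumptions: each ballot is represented as a list of already-extracted (contest, option) marks, and the function names and chunk size are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Ballot = List[Tuple[str, str]]  # (contest, option) marks on one ballot


def tally_chunk(chunk: List[Ballot]) -> Counter:
    """Count the marks in one chunk of ballots; this is the unit of parallel work."""
    counts: Counter = Counter()
    for ballot in chunk:
        counts.update(ballot)
    return counts


def parallel_tally(ballots: List[Ballot], chunk_size: int = 1000, workers: int = 4) -> Counter:
    """Split ballots into chunks, tally each chunk in a worker, then merge the results."""
    chunks = [ballots[i:i + chunk_size] for i in range(0, len(ballots), chunk_size)]
    total: Counter = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(tally_chunk, chunks):
            total.update(partial)  # Counter.update adds counts together
    return total
```

Because each chunk is tallied independently and the merge is a simple sum, the result does not depend on how many workers are used or in what order they finish.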
Phase 4 -- Reporting and Adjudication:
After the processing is complete, Audit Engine provides a comprehensive report and allows interactive inspection of individual ballots and contests to resolve voter intent in difficult cases.
The Adjudications File can then be prepared for future runs.
We are working to establish a full-ballot-image security standard which is based on the concept of "Trusted Systems" to ensure that we can trust that the ballot images are an accurate representation of the paper ballots.
Poll List: We are also planning to support comparison with the "Poll List", which provides the totals of all the voters who voted, so we can check that all the ballots are represented in the ballot image set. This is an area that needs further standardization.
Hash Codes File: This file provides the SHA256 hash for each file uploaded to the posting service.
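A SHA256 hash for each file can be computed with Python's standard hashlib module, as in the sketch below. The layout of the Hash Codes File itself is not specified here, so the name-to-hash dictionary returned by hash_manifest is only an assumed, hypothetical shape.

```python
import hashlib
from pathlib import Path
from typing import BinaryIO, Dict, Iterable


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in fixed-size blocks so large archives need little memory."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(chunk_size), b""):
        digest.update(block)
    return digest.hexdigest()


def sha256_of_file(path: str) -> str:
    """Return the SHA256 hex digest of the file at path."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)


def hash_manifest(paths: Iterable[str]) -> Dict[str, str]:
    """Map each file name to its SHA256 hex digest (assumed manifest shape)."""
    return {Path(p).name: sha256_of_file(p) for p in paths}
```

Recomputing the hashes after transfer and comparing them with the published values shows whether any uploaded file changed along the way.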
Please note that we do not require the ballot style definitions, which are normally used internally by election equipment vendors to determine the location of the ovals or "targets" on the ballot. Instead, Audit Engine reads the ballots and learns the styles as it goes.