This Page: https://copswiki.org/Common/M1961
Those are great points and I am sorry I added an evaluation of their report. But I will add my opinion here
because I did a lot of work to model these audits in Monte Carlo simulations and so I learned a lot
about what is going on.
But I also will say that in my experience, the actual tallying process was not generally where I have
found serious problems. The random selection (which they did not need to do) and then the decision
to declare that ther
The concept of risk limiting, per Philip Stark and others who promote these audits, is that adding
more samples to the audit reduces the sampling risk. If you pull samples randomly from a
population, then you will get a pretty good idea of the margin of victory long before a full hand count.
In the procedures popularized by Philip Stark, you don't actually know how many you will need to
pull in advance, although the official results will give you a good idea. If the contest ends up closer
than expected, then the procedure may go on longer, while, according to these procedures, if the
evidence is clear from the sample, then they may stop early.
In practice, doing an evaluation one ballot at a time is not practical, so the evaluation is done in
rounds. I assert from my simulations that you only ever need two rounds because by the end of
the first round, you will know enough about the population to determine the size of the second
round (for ballot comparison RLA). But regardless of the procedure, these statistical sampling
procedures have a key weakness: more samples are needed the closer the margin becomes.
One interesting thing is that if the samples are pulled randomly, then the number of samples needed
does not depend on how many ballots are in the population, only on the margin of victory.
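
A rough way to check this claim is a small Monte Carlo sketch like the one below. It is not the
simulation code mentioned earlier. The function name `fraction_correct`, the two-candidate setup,
and all the numbers are assumptions chosen only for illustration. It draws a fixed-size random
sample (without replacement) from two populations of very different sizes that share the same
margin, and reports how often the sample shows the true winner ahead.

```python
import random

def fraction_correct(population_size, margin, sample_size, trials, rng):
    """Fraction of trials in which a random sample shows the true winner ahead.

    Hypothetical two-candidate contest: ballots numbered below `winner_votes`
    are for the winner, the rest are for the loser.
    """
    winner_votes = round(population_size * (1 + margin) / 2)
    correct = 0
    for _ in range(trials):
        # Draw ballot indices without replacement, as a hand sample would.
        sample = rng.sample(range(population_size), sample_size)
        winner_in_sample = sum(1 for ballot in sample if ballot < winner_votes)
        if 2 * winner_in_sample > sample_size:
            correct += 1
    return correct / trials

rng = random.Random(1961)  # fixed seed so the run is repeatable
small = fraction_correct(10_000, 0.10, 500, 2_000, rng)
large = fraction_correct(1_000_000, 0.10, 500, 2_000, rng)
print(f"10,000 ballots:    {small:.3f}")
print(f"1,000,000 ballots: {large:.3f}")
```

Under these assumptions the two fractions should come out nearly the same, even though one
population is 100 times larger. Shrinking `margin` is what makes a 500-ballot sample unreliable.
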
In the case of a "ballot comparison RLA", the audit compares the sampled ballots on a ballot-by-ballot
basis against the Cast-Vote Records (CVRs), where there is one record per ballot, and
checks the exact votes on each ballot against those in its CVR.
This type of RLA audit is checking whether the CVR is accurate. If you compare ballot-by-ballot to the
CVR, then you can keep track of what have been termed "overstatements" and "understatements", i.e.
an overstatement is where the official results overstated the margin for the apparent victor, and
an understatement is the opposite.
If the margin was overstated in the official results, then it might flip, if enough overstatements
are found. But each overstatement might be offset by an understatement, and all elections have
a bit of "noise" like this. I modeled the concept of noise in my simulations but it is not something
the literature talks about. And since noise exists, almost no audit of this type should be perfectly clean.
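
The bookkeeping described above can be sketched roughly as follows. This is only an illustration:
the function names, the `(cvr_vote, hand_vote)` pair layout, and the sample data are all made up,
and it covers a single two-candidate comparison (reported winner versus reported loser) without
computing any risk statistic.

```python
from collections import Counter

def margin_contribution(vote, winner, loser):
    """+1 if the vote is for the reported winner, -1 for the loser, else 0."""
    return (vote == winner) - (vote == loser)

def tally_discrepancies(pairs, winner, loser):
    """Classify each (cvr_vote, hand_vote) pair from the sample.

    A positive discrepancy means the CVR credited the winner with more margin
    than the paper ballot supports (overstatement); negative is the opposite.
    """
    counts = Counter()
    net = 0
    for cvr_vote, hand_vote in pairs:
        d = (margin_contribution(cvr_vote, winner, loser)
             - margin_contribution(hand_vote, winner, loser))
        net += d
        if d > 0:
            counts["overstatement"] += 1
        elif d < 0:
            counts["understatement"] += 1
        else:
            counts["match"] += 1
    return counts, net

# Made-up sample: four ballots, reported winner "A", reported loser "B".
# None stands for a ballot with no valid vote in this contest.
sample_pairs = [("A", "A"), ("A", "B"), ("B", "A"), ("A", None)]
counts, net = tally_discrepancies(sample_pairs, "A", "B")
print(dict(counts), "net overstatement:", net)
```

In this made-up sample one overstatement is fully offset by an understatement, which is the kind
of "noise" described above, and only the net overstatement works against the reported margin.
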
The big problem with ballot-comparison RLA is that it requires that you have a way to
pair up the voted ballot with the CVR. This can take an incredible amount of
time, just to maintain all the paper in strict order so you can find them. Pilot RLAs in
Rhode Island found the amount of time prohibitive, esp. if the procedure required that
just 30 ballots be randomly pulled, which no observer finds satisfying.
But even worse, precinct scanners shuffle
the entries in the CVR report they produce, and the ballots fall into a bin without staying in
order. This is done for privacy reasons, so that it is not possible to match up the order in which
people exited the polling place with their ballots.
So that type of RLA is out in GA and most places that use precinct scanners.
A "Ballot Polling RLA" essentially does what a pollster does: it samples the ballots randomly
and estimates the margin of victory from the samples. The Margin of Error (MOE)
is the same as what they use in exit polling, but you don't have to rely on what someone
says. This requires that many more ballots be sampled individually, easily tens of thousands
if the contest is tight. You can predict pretty well how many up front for a given
risk limit. But the literature provides overly optimistic numbers based on the
"average" expected number of samples, because it is true that if you evaluate as you
go, then you may be able to stop early. But that average is only enough in about half the cases,
which means practitioners can't easily plan on it as the number to be sampled.
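
To give a feel for the scale, here is a minimal sketch of the pollster-style margin of error
applied to the sample margin. It assumes a two-candidate contest, sampling with replacement, and
the usual normal approximation. It is not the sequential test used in the published RLA
procedures, and the function names are made up for this example.

```python
import math
from statistics import NormalDist

def margin_moe(sample_size, margin, confidence=0.95):
    """Margin of error on the sample margin (winner share minus loser share).

    Normal approximation for a two-candidate contest whose true margin is
    `margin`, expressed as a fraction (0.10 means 10 points).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt((1 - margin ** 2) / sample_size)

def sample_size_for_moe(margin, confidence=0.95):
    """Smallest sample whose margin of error is no larger than the margin itself."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * (1 - margin ** 2) / margin ** 2)

for m in (0.10, 0.02, 0.01):
    print(f"margin {m:.2f}: about {sample_size_for_moe(m):,} ballots")
```

Note that the population size does not appear anywhere in these formulas, and that the required
sample grows roughly with the inverse square of the margin, so halving the margin means about four
times as many ballots.
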
But you can predict the worst case number required for any given risk limit, if
the results are not even tighter than the published margin of victory. Of course, if you
get done with that number, you may find that the contest is even closer and you have
THERE IS A THIRD type of risk limiting audit, which is a batch comparison type.
You compare one batch at a time against the official numbers for that batch. I like this
type of audit because it keeps the batches / precincts together and there is no
ballot-by-ballot sampling. They go faster because you can more easily go through
a batch than to go out, find a single ballot, count it, then at the end, put it back.
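
The per-batch comparison can be sketched along these lines. The dictionary layout, the batch
names, and the vote counts are all made up for illustration, and the sketch only reports
discrepancies; it does not decide how many batches to audit or when to stop.

```python
def batch_discrepancies(official, hand_counts, winner, loser):
    """Per-batch overstatement of the winner's margin, in votes.

    `official` and `hand_counts` map batch id -> {candidate: votes}.
    Positive values mean the official batch total overstated the margin.
    """
    result = {}
    for batch_id, hand in hand_counts.items():
        reported = official[batch_id]
        reported_margin = reported.get(winner, 0) - reported.get(loser, 0)
        hand_margin = hand.get(winner, 0) - hand.get(loser, 0)
        result[batch_id] = reported_margin - hand_margin
    return result

# Made-up numbers for two audited batches.
official = {"P-01": {"A": 120, "B": 100}, "P-02": {"A": 80, "B": 95}}
hand_counts = {"P-01": {"A": 118, "B": 101}, "P-02": {"A": 80, "B": 95}}
diffs = batch_discrepancies(official, hand_counts, "A", "B")
print(diffs)
```

Auditing another batch only means adding one more entry to `hand_counts`, so the same comparison
carries over unchanged if the audit has to grow.
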
The other good thing about this audit is that if you find that the contest(s) are closer
than you thought, then all you need to do is add more batches. With the other
types, if you find a full hand count is called for, it is not easy to switch from ballot
sampling to the more-efficient-to-count batches. This is another key reason why
I don't believe the ballot sampled audits are practical to implement if the margins
are close at all and there is this danger.
So indeed, even if it may not be well stated, in these types of ballot-sample RLA procedures,
you have to "switch gears" if it starts to get too close and go to the full hand count
by batch. The RLA literature sort of skips over this harsh reality.
They did not have the option of using the Ballot Comparison RLA.
Then you have to ask whether they should do a batch comparison audit, or just
count all the batches and scuttle the concept of sampling.
A full hand count is, in fact, an RLA, because then the sampling risk is
zero, so you have limited the risk in the only way a risk-limiting audit can do it.
In addition, if the margin is close, this is a good option because it also eliminates
all the risk of the sampling procedures. We don't need to choose random
numbers or rely on workers to choose samples fairly.
I agree with Garland Favorito about the silly use of ARLO but it is common
to have this dialed into the law so there is not an option. The good news
is that they also used hand-tally sheet recording, and that should be available
for review, allowing us to skip over ARLO in our review of the audit.