No, I don’t mean fake voters. I mean they don’t really have a problem with duplicate voters. Or at least they shouldn’t…
So it turns out I was wrong. We recently asked whether the US has a voter data duplication problem. We downloaded data on 50 million voters from 7 US states, ran it through the Tilores identity resolution engine, and found that 0.8% of the voters (nearly 400k) had a duplicate profile, either in the same state or in one of the other states in our sample.
Moreover, we found a thousand cases where these individuals had likely voted in both locations - evidence of electoral fraud.
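To make the idea of a "duplicate profile" concrete, here is a minimal sketch of duplicate detection by exact match on a normalised name and date of birth. It is not how the Tilores engine works internally; real identity resolution also handles typos, nicknames, and address changes. The field names (`voter_id`, `state`, `first_name`, `last_name`, `dob`) and the sample records are invented for illustration.

```python
from collections import defaultdict
import unicodedata


def normalise(text):
    """Lowercase, strip accents, and drop anything that is not an ASCII letter or digit."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed.lower() if ch.isascii() and ch.isalnum())


def match_key(record):
    """Build a simple comparison key: normalised first name, last name, and date of birth."""
    return (
        normalise(record["first_name"]),
        normalise(record["last_name"]),
        record["dob"],
    )


def find_duplicates(records):
    """Group voter IDs that share a match key; return only groups with more than one ID."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["voter_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical sample data: the first two rows describe the same person in two states.
records = [
    {"voter_id": "OH-1", "state": "OH", "first_name": "Ann", "last_name": "O'Neil", "dob": "1980-02-01"},
    {"voter_id": "FL-7", "state": "FL", "first_name": "ann", "last_name": "ONeil", "dob": "1980-02-01"},
    {"voter_id": "OH-2", "state": "OH", "first_name": "Bob", "last_name": "Smith", "dob": "1975-05-05"},
]

duplicates = find_duplicates(records)
```

An exact-key approach like this only catches the easy cases and will also flag different people who happen to share a name and birthday, which is why candidate matches need further evidence before anyone is treated as a duplicate.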
However, it turns out that the US doesn’t really have a problem with duplicate voters. It has a problem with partisan politics forcing Republican states to make completely illogical decisions to reduce the quality of their voting data.
When we discovered these duplicates we naively assumed that we were the first to discover the magnitude of this problematic data. We were not.
It turns out that in the United States there is a non-profit, non-partisan organisation called “ERIC” (the Electronic Registration Information Center), which already does exactly what we had done, working directly together with many US states to improve their voter registration lists.
Using entity resolution technology from our competitor, Senzing, ERIC compares voter data from more than half of the US states - with a healthy 50:50 split of Republican and Democratic states - to help clean up their voter lists.
Importantly, ERIC also uses extra data sources, such as the driving licence registers from the Department of Motor Vehicles (DMV), to provide an extra layer of data validation, all in a privacy-preserving manner.
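As an illustration of what privacy-preserving matching can look like, the sketch below compares keyed hashes of an identifier instead of the raw values, so two parties can find overlaps without exchanging the identifiers themselves. This is one common technique and is offered as an assumption, not as a description of ERIC's actual process. The key, the function names, and the licence numbers are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret shared by the participating parties and never published.
SHARED_KEY = b"example-key-agreed-out-of-band"


def pseudonymise(value, key=SHARED_KEY):
    """Return a keyed SHA-256 digest of the identifier after removing case and whitespace."""
    cleaned = "".join(value.lower().split())
    return hmac.new(key, cleaned.encode("utf-8"), hashlib.sha256).hexdigest()


def shared_tokens(voter_values, dmv_values):
    """Return the digests present in both lists; raw identifiers are never compared directly."""
    voter_tokens = {pseudonymise(v) for v in voter_values}
    dmv_tokens = {pseudonymise(v) for v in dmv_values}
    return voter_tokens & dmv_tokens


# Hypothetical licence numbers: the first entry in each list is the same licence, formatted differently.
voter_roll = ["D123 4567", "X9998887"]
dmv_register = ["d1234567", "Z0000001"]

overlap = shared_tokens(voter_roll, dmv_register)
```

A keyed hash is used here instead of a plain one because short, structured identifiers are easy to guess by brute force if anyone can compute the digest.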
We had a quick call with them to compare methodologies, where they pointed out that they generally catch the same duplicates as us, and had also detected the duplicate votes that we had. Let me be clear - they do good work with the data. However, like many things in the data world, the challenge is not just about improving the data - it is what you do with that data downstream. And that (actually removing duplicate voters) is outwith ERIC’s control.
So all good, right? ERIC exists already and they do good deduplication work to help the US improve its voter registration lists and detect fraudulent voters. Right?
Nope - it turns out that ERIC has become the target of some truly bonkers partisan attacks from far-right activists, which has led a number of Republican states to withdraw from the consortium. Not only have these states withdrawn with no plan for replacing ERIC’s function, thereby reducing the quality of their own voter lists, but their withdrawal also has a knock-on effect on the remaining states, which are now less able to detect cross-state duplicates.
The accusations of these far-right activists seem to centre on the “problem” that ERIC, as well as identifying duplicates, also identifies individuals who are eligible to vote but are not yet registered. Apparently these activists consider that such activity could only benefit Democrats.
I can’t claim to understand why far-right activists would consider that helping more people to vote would be counter to their interests, but there you go. That is the current state of US politics. For more background on this frustrating and complex topic, see this article from Votebeat.
In my apolitical, non-partisan opinion, the only question about ERIC should be - why does every single US state not use them?
If you are working with voter data from a country with a slightly more sensible political landscape than the US and would like to talk about cleaning your voter lists and helping more people vote whilst detecting any potential voter fraud - then please get in touch!