An unnamed debt-collection company from Germany was recently fined €900,000 under the General Data Protection Regulation (GDPR). Their misdemeanour? Not deleting data about individuals.
Below we will outline how you can avoid receiving such eye-watering fines in your own company, but first let's address why this is so difficult in the first place for some companies.
Data deletion is a particularly challenging problem for debt collection firms. We should know - we used to supply them with data when we were the leadership team at a German consumer credit bureau.
A typical debt collection company will have data from multiple sources. First, there are their clients, who supply the claims that the company is tasked with collecting. Then there are the various external data sources, such as credit bureaus, with which they work. Debt collection agencies need data from these sources to know a few specific things:
- What is the debtor’s current address? They may have moved home since the debt was incurred.
- Are they actually still at their last known address? The post service can confirm this.
- What is the debtor’s solvency status? If they have recently been declared bankrupt there is little point in trying to collect the debt.
- What is the debtor’s current credit status? The better their credit the more likely you are to collect the debt.
Most debt collection agencies will work with a number of data providers to make sure they get the most accurate, relevant data at the best price.
Importantly, each of these different data points will have a different retention period. Data from one source might need to be deleted after 6 months, whereas from another source it may be permissible to retain it for several years.
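To make the idea of per-source retention periods concrete, here is a minimal sketch in Python. The source names and the number of days are illustrative assumptions only, not legal guidance and not values from any real data provider.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data source, in days.
# These values are assumptions for illustration only.
RETENTION_DAYS = {
    "credit_bureau": 180,
    "postal_service": 365,
    "client_claim": 3 * 365,
}


def deletion_due(source: str, ingested_on: date) -> date:
    """Return the date on which a record from `source` must be deleted."""
    return ingested_on + timedelta(days=RETENTION_DAYS[source])


def is_expired(source: str, ingested_on: date, today: date) -> bool:
    """True once the retention period for this record has run out."""
    return today >= deletion_due(source, ingested_on)
```

The key point is that the deletion date depends on both the source and the ingestion date, so both have to be stored with every record.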
What complicates matters is that a debt collection agency might be collecting multiple debts from the same person without even realising it. Consider the example below - minor differences in the name and address details make the Kaufmarkt and EnergieNetz debtors look like different individuals.
Then consider that the person may have moved address and then incurred a debt with another company. With no unique key to link these records together, most databases would treat them as three different identities. Not only does that make data compliance difficult, it also means the debt collection company will have paid for data about this person three times, and will not consider their overall indebtedness when they try to collect any of the individual debts.
This is a classic “identity resolution” problem, and was exactly the data challenge that we solved while working at the credit bureau, leading to the founding of Tilores.
So then when should data related to this person be deleted? Using the first debt’s data as the starting date? Or the most recent?
Looking at our example above, we can apply identity resolution techniques using “fuzzy matching” rules to identify that the names are similar, that the first two records have similar addresses and phone numbers, and that the first and third records share the same email address. Therefore all three records belong to the same identity.
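A rough sketch of such matching rules in Python might look like the following. It uses `difflib.SequenceMatcher` from the standard library as a simple stand-in for a real fuzzy-matching engine. The three records, the field names and the 0.8 threshold are all hypothetical, chosen only to mirror the kind of situation described above.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Crude fuzzy comparison; the threshold is an assumed value."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def same_person(r1: dict, r2: dict) -> bool:
    """Hypothetical rule set: identical email, or similar name plus
    either a similar address or an identical phone number."""
    if r1.get("email") and r1.get("email") == r2.get("email"):
        return True
    if not similar(r1["name"], r2["name"]):
        return False
    same_phone = bool(r1.get("phone")) and r1.get("phone") == r2.get("phone")
    return similar(r1["address"], r2["address"]) or same_phone


# Invented example records, not real data.
r1 = {"name": "Anna Schmidt", "address": "Hauptstrasse 12, Berlin",
      "phone": "030123456", "email": "anna@example.com"}
r2 = {"name": "Anna Schmitt", "address": "Hauptstr. 12, Berlin",
      "phone": "030123456", "email": None}
r3 = {"name": "Anna Schmidt", "address": "Lindenweg 5, Hamburg",
      "phone": None, "email": "anna@example.com"}

print(same_person(r1, r2), same_person(r1, r3), same_person(r2, r3))
```

In this sketch the second and third records do not match each other directly, but both match the first, which is what ties all three into one identity.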
Consider then a more complicated example, where a person has moved house several times. Their identity may have a chain of 5 connected records: A > B > C > D > E. In this construct, A and E have no direct connection, but we know they belong together via the connections of B, C and D.
So someone may have incurred a debt when they lived at address A, but now we know they live at address E.
But what if we are obliged to delete record C, because it is actually the oldest record we hold about this person? Well, legally we can no longer justify that A and B belong to the same identity as D and E - we must now consider them as separate identities.
In a well-designed identity resolution system, the provenance of each data record is recorded within the so-called “identity graph” - the representation of all data about a person - along with the date the data was ingested into the system.
On a regular, ongoing basis, data from each source must be deleted automatically from the system, and hence from each individual identity graph, once its permissible retention period has expired. What is particularly challenging is maintaining data integrity, by ensuring that the ABCDE example above becomes AB and DE when record C is deleted.
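The ABCDE split can be illustrated with a small Python sketch that models an identity graph as records (nodes) and match links (edges), then recomputes the connected groups after a record is removed. This is a simplified illustration under our own assumptions about the data model, not a description of how any particular product implements it.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group records that are linked directly or indirectly."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


def delete_record(nodes, edges, record):
    """Remove a record and its links, then recompute the identities."""
    remaining_nodes = set(nodes) - {record}
    remaining_edges = [(a, b) for a, b in edges if record not in (a, b)]
    return connected_components(remaining_nodes, remaining_edges)


nodes = {"A", "B", "C", "D", "E"}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]

print(connected_components(nodes, edges))  # [['A', 'B', 'C', 'D', 'E']]
print(delete_record(nodes, edges, "C"))    # [['A', 'B'], ['D', 'E']]
```

Deleting C removes the only links between B and D, so the single identity falls apart into two.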
Assuming the identity resolution system is acting as the source of truth, but that original data records are also stored in separate databases, then these data sources must be instructed, via the identity resolution system, to delete these individual records simultaneously. This is achieved via an “entity-stream”, which lists every single change that happens to the identity graphs in the identity resolution system, as they happen in real-time.
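As a rough illustration of how a downstream system might consume such a stream, the Python sketch below walks through a list of change events and forwards deletions to the store that holds the original record. The event fields (`type`, `source`, `record_id`), the event type names and the `RecordStore` class are all hypothetical, invented for this example; they are not the actual entity-stream format.

```python
from dataclasses import dataclass, field


@dataclass
class RecordStore:
    """Stand-in for a separate database holding original records."""
    name: str
    records: dict = field(default_factory=dict)

    def delete(self, record_id: str) -> bool:
        """Delete a record; return True if it existed."""
        return self.records.pop(record_id, None) is not None


def apply_events(events, stores):
    """Forward deletion events to the matching store.

    Returns a list of (store name, record id) pairs that were deleted.
    """
    deleted = []
    for event in events:
        if event["type"] != "record_deleted":
            continue  # other change types are ignored in this sketch
        store = stores.get(event["source"])
        if store is not None and store.delete(event["record_id"]):
            deleted.append((store.name, event["record_id"]))
    return deleted


stores = {"credit_bureau": RecordStore("credit_bureau", {"C": {}, "D": {}})}
events = [
    {"type": "entity_updated", "source": "credit_bureau", "record_id": "D"},
    {"type": "record_deleted", "source": "credit_bureau", "record_id": "C"},
]
print(apply_events(events, stores))  # [('credit_bureau', 'C')]
```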
Any company holding lots of data about individuals will struggle with this data governance challenge, but identity resolution technology can be relatively straightforward to implement. Importantly, you don’t need to change your original data stores at all, so the time to value is very short.
If identity resolution technology means you could avoid a fine of 2% of your turnover, then any company that cares about its bottom line should be actively investigating the deployment of such technology today. If yours is not, it is worth asking your data team how they currently make sure consumer data is deleted to avoid GDPR fines.
Talk to the Tilores team today to learn how our industry-leading identity resolution technology can help your company stay GDPR compliant.