Companies House
directors deduplicated
Companies House is the UK's registry of companies and directors. Unfortunately, the register contains many duplicate director records, so we used Tilores to deduplicate the entire dataset.
10 million records. 614,121 duplicates linked.
Full UK Companies House register
After Tilores identity resolution
6% of all records were duplicates
Static snapshot of Companies House data as of end of July 2023. Results are experimental and for research purposes only — false positives are unavoidable given the data's limitations.
Duplicate director density across the UK
Each region shows the number of director entities, total profiles, and how far the data deviates from the ideal of one profile per person.
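One way to put a number on that deviation is the count of surplus profiles per director entity: total profiles divided by entities, minus one. This is a minimal sketch under that assumption. The exact metric behind the map is not spelled out here, and the function name and regional figures below are hypothetical.

```python
def surplus_profiles_per_entity(profiles: int, entities: int) -> float:
    """Return how many extra profiles exist per resolved director entity.

    0.0 corresponds to the ideal of exactly one profile per person.
    """
    if entities <= 0:
        raise ValueError("entities must be positive")
    if profiles < entities:
        raise ValueError("every entity has at least one profile")
    return profiles / entities - 1


# Hypothetical region: 1,000 profiles that resolve to 940 distinct people.
print(f"{surplus_profiles_per_entity(1_000, 940):.3f}")  # 0.064
```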
Even famous directors have duplicates
The husband of Victoria Beckham has 6 separate Companies House profiles across 15 companies, including Victoria Beckham Limited. Companies House actually shows 7 profiles, but one is linked only to a dissolved company.
Three versions of Edward Christopher Sheeran exist in the registry, across 11 companies — including FAT PUNT LTD and the catchily named HAYAGOTATOURBOI TOURING LLP.
Our favourite former Dragons' Den dragon, once in the Royal Navy and court-martialled for threatening to throw his commanding officer overboard. Monaco resident Duncan has 8 profiles and is linked to 33 companies.
Another Dragons' Den personality with 8 duplicate accounts across 43 companies. A data entry error even listed Peter's former name as "James Holdgate" — who turns out to be a company secretary.
Why is Companies House director data so bad?
Every time you register a new company and list yourself as a director, Companies House creates a completely new director record if you use a different registered postal address.
Companies are only added once to Companies House and have a unique ID — the company registration number. Individual directors, by contrast, are added multiple times by the directors themselves, with no unique identifier connected to them. A classic identity resolution problem.
UK company registration costs as little as £12 and takes minutes, but Companies House does not verify the accuracy of the information filed. The Guardian has described the registry as full of fakes, fast sign-ups and frauds.
The consequences of unresolved director data
Searching for a director in Companies House doesn't show all their associated companies. You have to search separately for each name variant and manually work out which records belong to the same person.
When directors have multiple unlinked profiles, compliance teams can miss the full picture — directorships at companies that should have triggered screening are invisible.
Customers with multiple accounts can become a compliance liability. Banks performing perpetual KYC should know the instant a new customer joins whether they're related to an existing account.
Hidden connections between companies through shared directors — using slightly different name spellings — can mask fraud networks that only become visible through entity resolution.
How we deduplicated 10 million director records
Downloaded ~5 million companies via the Companies House API — rate limited to 600 queries per 5 minutes. Total download time: over one month for all 10 million director records.
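A rough lower bound on the download time follows from the rate limit alone. The sketch below assumes one request per company and ignores pagination, retries and network latency, so the real figure is higher. The Throttle class and function names are illustrative and not part of any Companies House client.

```python
import time

LIMIT = 600           # requests allowed per window (figure quoted above)
WINDOW_SECONDS = 300  # 5 minutes


def min_download_days(requests: int, limit: int = LIMIT,
                      window: int = WINDOW_SECONDS) -> float:
    """Lower bound on download time in days at the maximum permitted rate."""
    seconds = requests / (limit / window)
    return seconds / 86_400


class Throttle:
    """Spaces calls at least window/limit seconds apart (0.5 s by default)."""

    def __init__(self, limit: int = LIMIT, window: int = WINDOW_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = window / limit
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve a slot."""
        delay = self.next_allowed - self.clock()
        if delay > 0:
            self.sleep(delay)
        self.next_allowed = max(self.clock(), self.next_allowed) + self.interval


# Assumption: one request per company for roughly 5 million companies.
print(round(min_download_days(5_000_000), 1))  # 28.9
```

At two requests per second the theoretical minimum is just under 29 days, which is consistent with a real-world total of over one month once paginated officer lists and retries are included.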
Used data samples to build fuzzy matching rules using Metaphone and Levenshtein distance algorithms to catch name variants, abbreviations, and spelling differences.
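As a rough illustration of how the two techniques complement each other, the sketch below computes Levenshtein edit distance in plain Python and pairs it with a deliberately crude phonetic key. Note that phonetic_key is a simplified stand-in and not the Metaphone algorithm, which has many more rules. All function names are illustrative, not the rules actually used.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def phonetic_key(name: str) -> str:
    """Crude phonetic key: NOT Metaphone, only a simplified stand-in.

    Uppercases, maps a few similar-sounding spellings together, drops
    non-leading vowels (plus H, W, Y) and collapses repeated letters.
    """
    s = re.sub(r"[^A-Z]", "", name.upper())
    for old, new in (("PH", "F"), ("CK", "K"), ("C", "K"), ("Z", "S")):
        s = s.replace(old, new)
    if not s:
        return ""
    s = s[0] + re.sub(r"[AEIOUYHW]", "", s[1:])
    return re.sub(r"(.)\1+", r"\1", s)


print(levenshtein("Sheeran", "Sheran"))                   # 1
print(phonetic_key("Phillip") == phonetic_key("Philip"))  # True
```

Edit distance catches typos and dropped letters, while a phonetic key groups spellings that sound alike even when several characters differ.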
Imported all 10 million records into Tilores. Because Tilores runs on a serverless AWS architecture, import speed is limited only by how many resources are deployed, not by a fixed technical bottleneck.
Two director records were linked when they matched on similar name + same birth month/year, or similar name + identical postcode. Name similarity was calculated using Metaphone combined with Levenshtein distance.
Important: Companies House publicly provides only the month and year of birth, not the day. In the general population this would produce too many false positives, but company directors are a smaller, specific subset, so match quality is acceptable for a public showcase. False positive matches remain unavoidable with this data.
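The linking rule described above can be sketched as follows. This is an illustration under stated assumptions, not the Tilores configuration: name similarity uses Levenshtein distance only (the phonetic step is omitted for brevity), the 2-edit threshold is made up, and grouping matches transitively with union-find is assumed. The Director class and all function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Director:
    name: str
    birth_year: int
    birth_month: int
    postcode: str


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def similar_name(a: str, b: str, max_distance: int = 2) -> bool:
    """Assumed threshold: at most 2 edits between case-folded names."""
    return levenshtein(a.casefold(), b.casefold()) <= max_distance


def is_match(x: Director, y: Director) -> bool:
    """Similar name AND (same birth month/year OR identical postcode)."""
    if not similar_name(x.name, y.name):
        return False
    same_birth = (x.birth_year, x.birth_month) == (y.birth_year, y.birth_month)
    px = x.postcode.replace(" ", "").upper()
    py = y.postcode.replace(" ", "").upper()
    return same_birth or (bool(px) and px == py)


def link(records: list) -> list:
    """Group record indices into entities via union-find over matches."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if is_match(records[i], records[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical records, invented for illustration only.
records = [
    Director("Jane A Smith", 1980, 5, "SW1A 1AA"),
    Director("Jane Smith", 1980, 5, "M1 1AE"),
    Director("Jane Smyth", 1975, 1, "m1 1ae"),
    Director("John Brown", 1980, 5, "SW1A 1AA"),
]
print(link(records))  # [[0, 1, 2], [3]]
```

Records 0 and 1 link on birth month and year, and records 1 and 2 link on postcode, so all three end up in one entity even though 0 and 2 do not match directly. Comparing every pair is quadratic and only workable for small samples; at the scale of 10 million records some form of indexing or blocking would be needed.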
Why we can do this
Companies House information is publicly available and free to search. The data forms the basis of all business credit reports supplied by agencies like Experian and Creditsafe, and is used by many organisations for due diligence, marketing, and lead generation.
Tilores relies on the legitimate interest ground for processing this personal data — providing a free service to help people conduct better due diligence on companies and their directors, to reduce risk and avoid fraud.
This is an experimental showcase. Results should not be considered definitive and are for research purposes only. No accuracy guarantees are made.
Want to resolve your own data?
If Companies House itself wanted to fix this, Tilores could sort their data quickly, since they're hosted on AWS. The same goes for any organisation that relies on Companies House data.