Classically, data deduplication is the process of eliminating redundant copies of data, ensuring that only one unique instance of the data is retained in storage. The redundant data can then be replaced with a reference to the unique data copy. This process not only saves storage space but also improves data management efficiency.
Tilores is identity resolution software. This means we handle data deduplication a little bit differently.
Tilores deduplicates records when the attributes used to match records together (e.g. name, address, email address) are identical. We do this to minimise the number of connections between records, so that entity resolution performance stays optimal. Other data on those records may be non-identical.
For example, suppose we have four records, each based on a character played by Kevin Bacon in his movies. As records, each one is unique. However, if we match those four records using Kevin's real name and date of birth, then all four records are connected to each other, which takes six pairwise connections. This is inefficient, and the number of connections grows quickly as more matching records arrive.
Instead of connecting all the Kevin Bacon records to each other, Tilores makes the first Kevin Bacon record (Ren McCormack from Footloose) the "master record" and connects each of the "non-identical duplicates" to that master record.
This way, no data is eliminated. All data associated with the non-identical duplicates is retained and available via API.
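To make the difference concrete, here is a minimal Python sketch that compares the two ways of connecting the records. It is an illustration only, not Tilores' implementation: the record layout, the `MATCH_KEYS` tuple and the helper names are all assumptions, and the three characters other than Ren McCormack are added for the example.

```python
from itertools import combinations

# Hypothetical records: the character data differs, the matching attributes do not.
records = [
    {"id": "r1", "character": "Ren McCormack", "movie": "Footloose",
     "name": "Kevin Bacon", "dob": "1958-07-08"},
    {"id": "r2", "character": "Valentine McKee", "movie": "Tremors",
     "name": "Kevin Bacon", "dob": "1958-07-08"},
    {"id": "r3", "character": "Jack Ross", "movie": "A Few Good Men",
     "name": "Kevin Bacon", "dob": "1958-07-08"},
    {"id": "r4", "character": "Jack Swigert", "movie": "Apollo 13",
     "name": "Kevin Bacon", "dob": "1958-07-08"},
]

# Assumed matching attributes for this example.
MATCH_KEYS = ("name", "dob")


def match_key(record):
    """Return the values of the matching attributes as a hashable key."""
    return tuple(record[k] for k in MATCH_KEYS)


def full_mesh_edges(records):
    """Connect every pair of records whose matching attributes are identical."""
    return [
        (a["id"], b["id"])
        for a, b in combinations(records, 2)
        if match_key(a) == match_key(b)
    ]


def master_edges(records):
    """Connect each duplicate only to the first record seen with the same key."""
    masters = {}  # match key -> id of the master record
    edges = []
    for record in records:
        key = match_key(record)
        if key not in masters:
            masters[key] = record["id"]  # first record becomes the master
        else:
            edges.append((masters[key], record["id"]))
    return edges


print(len(full_mesh_edges(records)))  # 6
print(len(master_edges(records)))     # 3
```

All four records stay in `records` with their character data intact; only the number of connections changes.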
Any subsequent Kevin Bacon record submitted to our dataset is handled in one of two ways. If its matching attributes are identical to those of the master record, it is deduplicated and connected to this master record. If there is actual variance in its matching attributes, record linkage occurs between that record and the deduplicated records.
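The two outcomes for a newly submitted record can be sketched as follows. This is a simplified model under stated assumptions, not Tilores' API: the `Index` class, its `ingest` method and the `loose_match` rule (same date of birth and same surname) are hypothetical stand-ins for whatever matching rules a real deployment configures.

```python
def match_key(record):
    """Matching attributes assumed for this example: name and date of birth."""
    return (record["name"], record["dob"])


def loose_match(key_a, key_b):
    """Hypothetical linkage rule: same date of birth and same surname."""
    (name_a, dob_a), (name_b, dob_b) = key_a, key_b
    same_surname = name_a.split()[-1].lower() == name_b.split()[-1].lower()
    return dob_a == dob_b and same_surname


class Index:
    def __init__(self):
        self.masters = {}     # match key -> master record id
        self.duplicates = {}  # master record id -> ids of its duplicates
        self.links = []       # record-linkage edges between master records

    def ingest(self, record):
        key = match_key(record)
        if key in self.masters:
            # Identical matching attributes: attach to the existing master.
            self.duplicates[self.masters[key]].append(record["id"])
            return "deduplicated"
        # Matching attributes vary: link to any master that matches loosely.
        linked = False
        for other_key, other_id in self.masters.items():
            if loose_match(key, other_key):
                self.links.append((other_id, record["id"]))
                linked = True
        # The record becomes the master for its own set of matching attributes.
        self.masters[key] = record["id"]
        self.duplicates[record["id"]] = []
        return "linked" if linked else "new"
```

In this sketch, a record with identical name and date of birth returns `"deduplicated"`, while a record for "Kevin N. Bacon" with the same date of birth returns `"linked"` and gets a single connection to the existing master rather than one to every duplicate.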
©2023 Tilores, All rights reserved.