Free Tool

Sørensen-Dice Coefficient

Similarity metric that measures overlap between two samples using character bigrams.

Tilores uses Sørensen-Dice in production — so you can automate matching with rules you configure.

Download Tilores Studio (Free)

Try it yourself

String A

String B

Similarity Score

0.9375

Match

How it works

The Sørensen-Dice coefficient is similar to Jaccard but weights the intersection more heavily: it divides twice the intersection by the sum of both set sizes. This tends to produce higher similarity scores than Jaccard for partial matches. Like Jaccard, it uses character bigrams and is insensitive to string order. It is commonly used in data deduplication where a slightly more lenient matching threshold is desired.

Use cases in entity resolution

Text similarity scoring

Duplicate detection

Data quality assessment

Record linkage

Related tools

Cosine Similarity Calculator

Measure the cosine of the angle between two strings represented as vectors. Used in RAG, recommendation systems, and information retrieval.

Jaccard Similarity Coefficient

Compare the overlap between two sets — the intersection divided by the union of character bigrams.

Q-gram (N-gram) Similarity

Divide strings into substrings of length Q and compare them to determine similarity.

Don't implement this yourself

Tilores Studio runs the full matching engine — this algorithm plus configurable rules and real-time entity resolution — locally on your machine. Free, no account, no cloud. Load your own data and see it working in minutes.

Download Tilores Studio