Fuzzy matching algorithms
Enter two strings and compare them across all algorithms at once — edit distances, similarity scores, and phonetic codes. No registration. No logging.
Match rules: similarity ≥ 0.80 • distance ≤ 3 • phonetic codes equal.
About these algorithms
Each algorithm measures string similarity differently. Here's when to use each one.
String Distance
Levenshtein: counts the minimum number of insertions, deletions, and substitutions needed to transform one string into another. The most widely used edit distance metric.
Optimal String Alignment (restricted Damerau-Levenshtein): extends Levenshtein with transpositions, so swapping two adjacent characters counts as one edit. "teh" → "the" costs 1, not 2.
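The difference between plain edit distance and the transposition-aware variant is easiest to see side by side. The following is a minimal sketch in plain Python, not the code behind this page; the function names `levenshtein` and `osa` are illustrative, and `osa` assumes the restricted form in which a transposed pair is not edited again.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def osa(a: str, b: str) -> int:
    """Levenshtein plus adjacent transpositions (restricted variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + cost)
            # Adjacent swap: the last two characters of each prefix are crossed.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(levenshtein("teh", "the"))  # 2
print(osa("teh", "the"))          # 1
```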
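Damerau-Levenshtein: the true Damerau-Levenshtein distance with unrestricted transpositions, so a swapped pair can be edited again. Its four operations (insertion, deletion, substitution, transposition) account for over 80% of human misspellings, according to Damerau's original study.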
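A sketch of the unrestricted distance, following the standard dynamic-programming formulation that tracks where each character was last seen. The function name is illustrative and the code is an assumption about a typical implementation, not the one used here. The classic case that separates it from the restricted variant is "ca" → "abc": transpose to "ac", then insert "b", for a cost of 2, where the restricted variant needs 3.

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Edit distance with unrestricted adjacent transpositions."""
    last_row = {}                 # last row (1-based) where each char of a occurred
    big = len(a) + len(b)         # larger than any real distance
    # Table is offset by one extra border row and column filled with `big`.
    d = [[big] * (len(b) + 2) for _ in range(len(a) + 2)]
    for i in range(len(a) + 1):
        d[i + 1][1] = i
    for j in range(len(b) + 1):
        d[1][j + 1] = j

    for i in range(1, len(a) + 1):
        last_col = 0              # last column in this row where the chars matched
        for j in range(1, len(b) + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_col
            cost = 1
            if a[i - 1] == b[j - 1]:
                cost = 0
                last_col = j
            d[i + 1][j + 1] = min(
                d[i][j] + cost,                           # substitution or match
                d[i + 1][j] + 1,                          # insertion
                d[i][j + 1] + 1,                          # deletion
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),  # transposition
            )
        last_row[a[i - 1]] = i
    return d[-1][-1]


print(damerau_levenshtein("teh", "the"))  # 1
print(damerau_levenshtein("ca", "abc"))   # 2
```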
Longest Common Subsequence (LCS) distance: allows only insertions and deletions, so replacing a character costs 2 (one deletion plus one insertion). Useful for sequence comparison.
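With only insertions and deletions allowed, the distance follows directly from the LCS length: every character outside the common subsequence has to be deleted from one string or inserted from the other. A small sketch, with illustrative function names:

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)            # extend the subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip a char on one side
        prev = curr
    return prev[-1]


def lcs_distance(a: str, b: str) -> int:
    """Insert/delete-only edit distance derived from the LCS length."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


print(lcs_distance("teh", "the"))         # 2
print(lcs_distance("kitten", "sitting"))  # 5
```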
String Similarity
Jaro: measures how many characters match within a sliding window and how many of those matches are out of order. Returns a score from 0 to 1. Widely used in census data processing and record linkage.
Jaro-Winkler: boosts the Jaro score when the strings share a common prefix. Particularly effective for name matching.
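The two scores can be sketched as follows. This is a textbook formulation rather than the code behind this page; the function names are illustrative, and the prefix weight of 0.1 and maximum prefix length of 4 are the commonly quoted defaults, assumed here.

```python
def jaro(a: str, b: str) -> float:
    """Jaro similarity in [0, 1]."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters match only if they are equal and close enough in position.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit = [False] * len(a)
    b_hit = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ca:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order, counted in pairs.
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a)
            + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(a: str, b: str, prefix_weight: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro score raised in proportion to the length of the shared prefix."""
    score = jaro(a, b)
    prefix = 0
    for x, y in zip(a[:max_prefix], b[:max_prefix]):
        if x != y:
            break
        prefix += 1
    return score + prefix * prefix_weight * (1 - score)


print(round(jaro("MARTHA", "MARHTA"), 4))          # 0.9444
print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```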
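Cosine similarity: converts strings to bigram count vectors and measures the angle between them. Normalised by vector length, so strings of different lengths stay comparable. The same measure is applied to embedding vectors in RAG and search.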
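A minimal sketch of the bigram version, using only the standard library. The function names are illustrative, and returning 0.0 when either string has no bigrams is a convention chosen for this example.

```python
from collections import Counter
from math import sqrt


def bigram_counts(s: str) -> Counter:
    """Count each pair of adjacent characters in s."""
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the two bigram count vectors."""
    va, vb = bigram_counts(a), bigram_counts(b)
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm = (sqrt(sum(c * c for c in va.values()))
            * sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0  # 0.0 when a string has no bigrams


# "night" -> ni, ig, gh, ht and "nacht" -> na, ac, ch, ht share one bigram.
print(cosine_similarity("night", "nacht"))  # 0.25
```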
Jaccard: the size of the intersection divided by the size of the union of the two bigram sets. Good for address matching where word order may vary.
Sørensen-Dice: similar to Jaccard but counts shared bigrams twice, 2|A ∩ B| / (|A| + |B|). Never scores lower than Jaccard, so partial matches rate higher.
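Both set-based scores can be computed from the same bigram sets, which makes the difference in weighting easy to see. A short sketch with illustrative function names; returning 0.0 when neither string has any bigrams is a choice made for this example.

```python
def bigram_set(s: str) -> set:
    """Distinct pairs of adjacent characters in s."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a: str, b: str) -> float:
    """|A ∩ B| / |A ∪ B| over bigram sets."""
    sa, sb = bigram_set(a), bigram_set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def dice(a: str, b: str) -> float:
    """2|A ∩ B| / (|A| + |B|) over bigram sets."""
    sa, sb = bigram_set(a), bigram_set(b)
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 0.0


# One shared bigram ("ht") out of seven distinct bigrams overall.
print(round(jaccard("night", "nacht"), 3))  # 0.143
print(dice("night", "nacht"))               # 0.25
```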
Compares frequency-weighted n-grams. Robust against transpositions and used for candidate pre-filtering.
A library built on Levenshtein distance with preprocessing and multiple matching modes. No calculator available due to licensing restrictions.
Phonetic Encoding
Soundex: English phonetic algorithm producing a 4-character code, one letter followed by three digits. "Smith" and "Smyth" both encode to S530.
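A compact sketch of the encoding, following the common American Soundex rules: keep the first letter, map consonants to digits, drop vowels, collapse adjacent equal digits, and pad with zeros. The function name is illustrative, and details such as the handling of H and W vary between implementations.

```python
# Consonant groups that share a digit; vowels, H, W, and Y have no code.
_GROUPS = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT", "4": "L", "5": "MN", "6": "R"}
_CODES = {letter: digit for digit, letters in _GROUPS.items() for letter in letters}


def soundex(name: str) -> str:
    """Four-character Soundex code, or an empty string if name has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    last = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate consonants with the same code
        code = _CODES.get(ch, "")
        if code and code != last:
            result += code
        last = code  # a vowel resets this, so a repeated code is kept after it
    return (result + "000")[:4]


print(soundex("Smith"))  # S530
print(soundex("Smyth"))  # S530
```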
Cologne phonetics (Kölner Phonetik): phonetic encoding optimised for German. Correctly handles umlauts and German consonant patterns. "Müller" and "Mueller" encode identically.
Metaphone: a more sophisticated English phonetic encoding than Soundex. Handles silent letters and consonant combinations.
Don't implement these yourself
Tilores combines all these algorithms with data transformation and configurable rules to resolve entities at production scale.