Identity Resolution for LLMs: What IdentityRAG Is and Why It Matters
TL;DR
Identity resolution for LLMs is the process of matching a person's scattered records across systems into one accurate, unified profile that a language model can retrieve at query time — instead of letting the model guess who the customer is.
IdentityRAG is the pattern where the LLM queries a dedicated, real-time entity-resolution system for that profile, rather than performing identity matching itself — which it does unreliably, slowly, and unexplainably.
It matters because a customer-facing LLM that confuses two people, or stitches together the wrong records, doesn't just look bad — in regulated workflows like KYC, AML, and support, it is a compliance and trust failure.
Large language models are extraordinary at language and unreliable at identity. Ask a model who “J. Smith, +44 7700 900123” is, and it will confidently assemble an answer — sometimes from the right records, sometimes from two different customers who happen to share a name. For a marketing chatbot that’s awkward. For a bank’s onboarding agent, it’s a regulatory incident. This article defines identity resolution for LLMs, explains the IdentityRAG pattern that fixes it, and shows why it is becoming a required layer in any AI system that touches real customer data.
What is identity resolution for LLMs?
Identity resolution for LLMs is the process of resolving a person’s or company’s scattered records — across CRM, support, billing, and product systems — into a single accurate profile that a language model can retrieve when it needs to reason about that specific entity. Entity resolution itself is, in Tilores’ own words, “the connecting of non-identical, related data from disparate sources to ‘entities’,” where entities can be people, companies, or financial transactions. The “for LLMs” part is the new requirement: the resolved profile has to be available at the speed and in the shape an LLM needs, in real time, inside the model’s context window.
The distinction that matters is who does the matching. Without identity resolution, the LLM is implicitly asked to do the matching itself — to decide, from a pile of retrieved text, which records belong to the same person. With identity resolution, a purpose-built system makes that decision consistently and explainably enough to be audited, and the LLM simply consumes the resolved result.
What is IdentityRAG?
IdentityRAG is a retrieval pattern in which a large language model retrieves customer context from a dedicated, real-time entity-resolution system instead of resolving identities itself. Tilores describes it directly: “Tilores uses IdentityRAG technology to provide context about your customers to Large Language Model (LLM) chatbots,” connecting the model — via LangChain and Amazon Bedrock — to a Tilores instance holding unified identity data.
The crucial design move is that the pattern inverts the relationship: LLMs query a specialized system rather than performing resolution themselves. Standard retrieval-augmented generation (RAG) fetches documents by semantic similarity. IdentityRAG fetches a resolved entity: every record that genuinely belongs to one customer, merged into one profile, with the matching logic living in the entity-resolution layer rather than in the model’s guesswork.
How does IdentityRAG work?
IdentityRAG works in three stages — unify, resolve, retrieve — executed at query time rather than baked in ahead of time. Tilores breaks the flow into three components:
Data unification. Scattered customer data from every source system — Salesforce, HubSpot, Zendesk, Mailchimp, Snowflake and more — is connected into one searchable layer.
Identity resolution. Matching algorithms work on "any available identity attribute similarity, including name, address, date of birth, email, phone number, device," using fuzzy, probabilistic matching rather than exact string equality.
Real-time retrieval. The system "builds dynamic customer profiles at query time," handing the LLM a unified, current profile in real time.
The detail that separates this from a nightly data pipeline is the read-time golden record. Tilores defines the golden record “at read-time, as opposed to at write time,” which means the unified profile is assembled when the agent asks — reflecting the latest data in every source system, with no stale pre-computed merge. Tilores reports response times of “less than 150 milliseconds,” fast enough to sit inside a live conversation.
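The three stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Tilores' implementation: the records, the field names, the `is_match` rule (shared email or phone plus a fuzzy name score from the standard library's `difflib`), and the 0.7 threshold are all hypothetical. Union-find is used so that links are transitive: two records that never match each other directly still land in the same entity if both match a third.

```python
from difflib import SequenceMatcher

# Unify: hypothetical records pulled from three source systems into one list.
RECORDS = [
    {"id": "crm-1", "name": "John Smith", "email": "j.smith@example.com", "phone": "+447700900123"},
    {"id": "sup-7", "name": "Jon Smith", "email": "", "phone": "+447700900123"},
    {"id": "bil-3", "name": "J. Smith", "email": "j.smith@example.com", "phone": ""},
    {"id": "crm-2", "name": "Jane Smith", "email": "jane.smith@example.com", "phone": "+447700900456"},
]


def name_similarity(a, b):
    """Fuzzy string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(a, b, name_threshold=0.7):
    """Assumed rule: a shared non-empty email or phone AND similar names."""
    shares_contact = any(a[k] and a[k] == b[k] for k in ("email", "phone"))
    return shares_contact and name_similarity(a["name"], b["name"]) >= name_threshold


def resolve(records):
    """Resolve: return {record_id: entity_id}, linking matches transitively."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Pairwise comparison is quadratic; fine for a sketch, not for production.
    for i, a in enumerate(records):
        for b in records[i + 1:]:
            if is_match(a, b):
                parent[find(a["id"])] = find(b["id"])
    return {rid: find(rid) for rid in parent}


def retrieve_profile(records, phone):
    """Retrieve: assemble the unified profile when asked, not ahead of time."""
    entity_of = resolve(records)
    hits = [r for r in records if r["phone"] == phone]
    if not hits:
        return None
    members = [r for r in records if entity_of[r["id"]] == entity_of[hits[0]["id"]]]
    return {
        "record_ids": sorted(r["id"] for r in members),
        "names": sorted({r["name"] for r in members}),
        "emails": sorted({r["email"] for r in members if r["email"]}),
    }


profile = retrieve_profile(RECORDS, "+447700900123")
print(profile["record_ids"])
```

Here the support and billing records share no attribute with each other, yet both join the CRM record's entity, while Jane Smith stays separate despite the shared surname.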
Why can’t an LLM just do entity resolution itself?
An LLM cannot reliably do entity resolution itself because it fails on the five things production identity matching demands. The Tilores team originally built its resolution engine for a European consumer credit bureau’s fraud-prevention and anti-money-laundering work, and they call using an LLM as the matcher “a sledgehammer for a nut.” Their five objections:
Consistency. "Ask an LLM the same question in slightly different ways and there is a good chance you get a different answer," alongside "straight out hallucinations." Identity matching has to be repeatable.
Explainability. "'LLM says so' is not good enough to explain why two data records have been linked." Regulated workflows need an audit trail; a configurable matching engine backed by an entity graph provides one.
Precision and tuning. Different jobs need different thresholds — a credit bureau optimizes against false positives, fraud detection tolerates them for recall. Models can't be tuned this way; configurable matchers can.
Temporal and relational memory. Name changes, multiple addresses over time, corporate mergers — these need a persistent entity graph, which "an LLM cannot really maintain."
Speed and cost. "LLMs are just slow. Processing a single record could take a few seconds," and LLM-based matching can be "100x more expensive, and possibly 1000s of times more expensive" than a specialized system.
The takeaway isn’t “don’t use LLMs.” It’s: let the LLM do language, and let a dedicated system do identity. That division of labour is exactly what IdentityRAG formalizes.
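To make the consistency, explainability, and tuning points concrete, here is a minimal sketch of a configurable matcher. Everything in it is an assumption for illustration: the additive scoring, the weights, the two threshold profiles, and the names `MatchConfig` and `explain_match` are hypothetical and do not describe how Tilores scores matches. What it shows is the shape of the argument: the same inputs always give the same decision, each decision carries its reasons, and the threshold is a setting rather than a property of a model.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# Hypothetical attribute weights; a real system would tune these per use case.
WEIGHTS = {"email": 0.5, "phone": 0.3, "name": 0.2}


@dataclass(frozen=True)
class MatchConfig:
    name: str
    threshold: float  # minimum score required to link two records


# Assumed profiles: strict to avoid false positives, lenient to favour recall.
STRICT = MatchConfig("credit_bureau", threshold=0.8)
LENIENT = MatchConfig("fraud_screening", threshold=0.4)


def explain_match(a, b, config):
    """Score a record pair and return the decision together with its reasons."""
    score = 0.0
    reasons = []
    for attr in ("email", "phone"):
        if a.get(attr) and a.get(attr) == b.get(attr):
            score += WEIGHTS[attr]
            reasons.append(f"{attr} identical (+{WEIGHTS[attr]})")
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    contribution = round(WEIGHTS["name"] * name_sim, 3)
    score = round(score + contribution, 3)
    reasons.append(f"name similarity {name_sim:.2f} (+{contribution})")
    return {
        "config": config.name,
        "score": score,
        "linked": score >= config.threshold,
        "reasons": reasons,  # the audit trail for this decision
    }


a = {"name": "John Smith", "email": "", "phone": "+447700900123"}
b = {"name": "Jon Smith", "email": "jon@example.com", "phone": "+447700900123"}

strict_result = explain_match(a, b, STRICT)
lenient_result = explain_match(a, b, LENIENT)
print(strict_result["linked"], lenient_result["linked"], lenient_result["reasons"])
```

The same pair is left unlinked under the strict profile and linked under the lenient one, and in both cases the `reasons` list records why.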
Where does identity resolution belong in an AI stack?
Identity resolution belongs between your source systems and your LLM, as the retrieval layer the model calls whenever it needs to know who it is talking about. In practice it sits alongside — not inside — your vector database. The two solve different problems: a vector store retrieves semantically similar text, while an entity-resolution API retrieves the correct, merged record set for one identity. The following comparison makes the split concrete.
A mature customer-facing AI system usually uses both: vector RAG for “what does our refund policy say,” IdentityRAG for “who is this caller and what have they bought.”
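A rough sketch of how the two retrievers might feed one prompt follows. Both are stubs invented for illustration: `retrieve_policy` uses naive keyword overlap as a stand-in for real vector similarity search, and `retrieve_identity` reads from a hard-coded dictionary as a stand-in for a call to an entity-resolution API. The policy text, profile data, and function names are all hypothetical.

```python
import re

# Illustrative data only: document chunks and already-resolved profiles.
POLICY_CHUNKS = [
    "Refunds are available within 30 days of purchase.",
    "Shipping takes 3 to 5 business days.",
]
RESOLVED_PROFILES = {
    "+447700900123": {"name": "John Smith", "orders": ["A-1001", "A-1002"]},
}


def tokens(text):
    """Lowercased alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_policy(question):
    """Stand-in for vector RAG: pick the chunk sharing the most words."""
    q = tokens(question)
    return max(POLICY_CHUNKS, key=lambda chunk: len(q & tokens(chunk)))


def retrieve_identity(phone):
    """Stand-in for IdentityRAG: look up the resolved entity for a caller."""
    return RESOLVED_PROFILES.get(phone)


def build_context(question, caller_phone):
    """Combine document context and identity context for the model."""
    profile = retrieve_identity(caller_phone)
    if profile is None:
        who = "unknown caller"
    else:
        orders = ", ".join(profile["orders"])
        who = f"{profile['name']}, orders: {orders}"
    return f"POLICY: {retrieve_policy(question)}\nCUSTOMER: {who}"


context = build_context("What is your policy on refunds?", "+447700900123")
print(context)
```

The point of the split is that neither retriever can do the other's job: similarity search has no notion of which records belong to one person, and the identity lookup knows nothing about policy text.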
Why does identity resolution for LLMs matter now?
It matters now because AI has moved from drafting text to acting on real customer records, and the cost of getting identity wrong has changed from embarrassing to dangerous. A 2024 model summarizing a support ticket could afford to be approximately right. A 2026 agent that reads a caller’s history, authorizes a refund, or makes a KYC decision cannot — confusing two customers becomes a privacy breach, a wrong financial action, or a compliance failure. The table below shows how the stakes scale with autonomy.
Tilores’ own customer evidence shows the upside of getting it right: fintech lender Banxware reports a “99% decrease in time spent on manual customer credit decisions” after resolving identity in real time. Tilores also cites a patent on the technology (US Patent No. 12,248,479 B2), SOC 2 certification, and GDPR alignment, signals that matter precisely because the use cases are regulated.
Frequently asked questions
Is identity resolution the same as RAG? No. RAG retrieves relevant documents by semantic similarity. Identity resolution retrieves a unified entity — every record belonging to one customer, merged. IdentityRAG combines them: it is RAG where the retrieved context is a resolved identity rather than a document chunk.
Is IdentityRAG a Tilores product or a general pattern? IdentityRAG is the name Tilores uses for the pattern of grounding an LLM in a dedicated entity-resolution system. The underlying idea — resolve identity in a specialized layer, then let the LLM query it — is general; Tilores is the implementation that coined and productized the term.
How fast does identity resolution need to be for a live LLM conversation? Fast enough to be invisible inside a turn of dialogue. Tilores reports under 150 milliseconds, which keeps the resolution step from adding noticeable latency to a chatbot or voice agent.
Why not just fine-tune the LLM on customer data? Fine-tuning bakes a snapshot into the model, goes stale immediately, can leak data between customers, and still can’t guarantee which records belong to the same person. Real-time retrieval from a resolution system stays current and keeps identity logic auditable.
What is a read-time golden record? A golden record is the single, merged version of a customer assembled from many sources. “Read-time” means it is assembled when you query, not pre-computed — so it reflects the latest data in every system and can be shaped differently for different questions.
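As a small illustration of "shaped differently for different questions", the sketch below assembles a golden record from one entity's records using rules supplied at query time. The records, the field names, and the two rules (`most_recent`, `most_frequent`) are assumptions made up for this example, not Tilores' API.

```python
from collections import Counter

# Hypothetical records already resolved to one entity.
# ISO dates compare correctly as plain strings.
ENTITY_RECORDS = [
    {"source": "crm", "updated": "2023-01-10", "address": "1 High St, Leeds"},
    {"source": "billing", "updated": "2024-06-02", "address": "1 High St, Leeds"},
    {"source": "support", "updated": "2025-03-15", "address": "22 Mill Rd, York"},
]


def most_recent(records, attr):
    """Take the value from the most recently updated record."""
    return max(records, key=lambda r: r["updated"])[attr]


def most_frequent(records, attr):
    """Take the value that appears in the most records."""
    return Counter(r[attr] for r in records).most_common(1)[0][0]


def golden_record(records, rules):
    """Assemble the merged view at read time from the caller's rules."""
    return {attr: rule(records, attr) for attr, rule in rules.items()}


# A delivery question wants the latest address.
delivery_view = golden_record(ENTITY_RECORDS, {"address": most_recent})
# A verification question might prefer the most corroborated address.
verification_view = golden_record(ENTITY_RECORDS, {"address": most_frequent})
print(delivery_view, verification_view)

# Because nothing is pre-computed, a new source record shows up on the next read.
ENTITY_RECORDS.append(
    {"source": "crm", "updated": "2025-09-01", "address": "5 Park Ave, Hull"}
)
latest_view = golden_record(ENTITY_RECORDS, {"address": most_recent})
```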
Does identity resolution replace my CDP or MDM? Not necessarily. A CDP centralizes marketing data and an MDM governs master records, often in batch. A real-time entity-resolution API serves the live, query-time identity an LLM or agent needs. Many teams run them together; the comparison piece linked below covers when each is the right call.
Is fuzzy matching accurate enough for regulated use? Probabilistic matching on multiple attributes (name, address, date of birth, email, phone, device) is how credit bureaus and fraud teams already operate, because configurable thresholds and an explainable entity graph give the precision and audit trail regulators expect — something an LLM’s opaque judgment cannot.
How do I add identity resolution to an existing LLM app? Connect a real-time entity-resolution API as a retrieval tool the model calls — for example, via a LangChain integration or by exposing the API as a tool to your agent. The how-to guide linked below walks through the build.
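For orientation, this is a framework-neutral sketch of the "expose the API as a tool" option, using only the Python standard library. The `resolve_customer` function is a hypothetical stand-in that reads from an in-memory list where a real integration would call the entity-resolution API. The tool spec layout and the shape of the tool-call JSON are also assumptions, since each agent framework defines its own.

```python
import json

# Hypothetical resolved profiles standing in for a live resolution service.
_PROFILES = [
    {
        "entity_id": "e-1",
        "names": ["John Smith", "Jon Smith"],
        "emails": ["j.smith@example.com"],
        "phones": ["+447700900123"],
    },
]


def resolve_customer(email=None, phone=None):
    """Stand-in for a real-time entity-resolution lookup."""
    for profile in _PROFILES:
        if (email and email in profile["emails"]) or (phone and phone in profile["phones"]):
            return {"found": True, "profile": profile}
    return {"found": False, "profile": None}


# Description handed to the model so it knows when and how to call the tool.
TOOL_SPEC = {
    "name": "resolve_customer",
    "description": "Return the unified profile for a customer by email or phone.",
    "parameters": {
        "type": "object",
        "properties": {
            "email": {"type": "string"},
            "phone": {"type": "string"},
        },
    },
}

TOOLS = {"resolve_customer": resolve_customer}


def handle_tool_call(call_json):
    """Dispatch a model-issued tool call and return a JSON result string."""
    call = json.loads(call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        return json.dumps({"error": f"unknown tool: {call['name']}"})
    return json.dumps(fn(**call.get("arguments", {})))


# Simulated tool call, in an assumed format, as a model might emit it.
reply = handle_tool_call(
    '{"name": "resolve_customer", "arguments": {"phone": "+447700900123"}}'
)
print(reply)
```

The model never sees raw source records or matching rules; it receives only the resolved profile, which keeps the identity logic in the resolution layer.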
Author context
Steven Renwick is CEO and co-founder of Tilores, where he works on real-time entity resolution, IdentityRAG, and customer identity infrastructure for AI systems.
Related reading
[Entity resolution for AI: API vs vector DB vs MDM vs CDP (and how to choose in 2026)](https://tilores.io/content/Entity-Resolution-for-AI-API-vs-Vector-Database-vs-MDM-vs-CDP-and-How-to-Choose-in-2026)
How to give an AI agent a real-time, resolved customer view (RAG + MCP)
[Tilores: IdentityRAG for LLMs](https://tilores.io/RAG)
[Tilores documentation](https://docs.tilotech.io/tilores/)
[LangChain documentation for implementing Tilores](https://docs.langchain.com/oss/python/integrations/providers/tilores)
[Tilores IdentityRAG GitHub repo](https://github.com/tilotech/identity-rag-customer-insights-chatbot)
[Identity resolution glossary](https://tilores.io/identity-resolution-glossary)