Harness Attribute-Specific Analytics
Use probabilistic AND rules-based entity resolution in one powerful engine
Which is more powerful, entity resolution based on probabilistic analytics or rules-based systems? With Identity Resolution Engine, combine the best of both worlds and apply the best tools for each problem.
You may have heard discussions about the relative merits of rules-based entity resolution using attribute-specific analytics versus probabilistic entity resolution that uses mathematical analytics exclusively. Let's examine a few of the arguments and then look at the facts:
- Rules-based systems can only yield binary answers; i.e. they require that attributes either match exactly or not.
FACT: The best systems use the best of rules and similarity searching technology, enabling them to compute the distance between attributes and make complex decisions based on those calculations. These hybrid solutions can actually run more efficiently while providing better results.
- Rules-based systems demand that all data sources be centralized and rationalized.
FACT: Flexible systems do not require data sources to be centralized, warehoused, or have conforming schemas.
- Probabilistic-only entity resolution systems enable decision-making based on relative likelihood, while rules-based systems do not.
FACT: It's possible to combine the best of rules and probability to create effective identity resolution solutions that make decisions based on relative likelihood.
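To make the hybrid idea concrete, here is a minimal sketch in Python of a decision that mixes a hard rule with graded similarity scores. It is an illustration only, not how Identity Resolution Engine works internally: the field names (`id_number`, `name`, `city`), the weights, and the thresholds are all hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for whatever similarity search a real product would use.

```python
from difflib import SequenceMatcher

# Hypothetical weights and thresholds, chosen only for illustration.
WEIGHTS = {"name": 0.6, "city": 0.4}
MATCH_THRESHOLD = 0.85
REVIEW_THRESHOLD = 0.60


def similarity(a: str, b: str) -> float:
    """Return a graded 0..1 similarity instead of a binary match/no-match."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve(rec_a: dict, rec_b: dict) -> str:
    """Combine one hard rule with a weighted similarity score."""
    # Rule: when both records carry an ID number, it decides the outcome.
    id_a, id_b = rec_a.get("id_number"), rec_b.get("id_number")
    if id_a and id_b:
        return "match" if id_a == id_b else "no_match"

    # Otherwise fall back to relative likelihood from attribute distances.
    score = sum(w * similarity(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= REVIEW_THRESHOLD:
        return "possible_match"
    return "no_match"
```

The rule short-circuits the expensive comparison when a decisive attribute is present, which is one way a hybrid approach can run more efficiently; when the rule does not apply, the graded score allows an answer of "possible_match" rather than a forced yes or no.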
"One size fits all" doesn't work well in many domains, and it unnecessarily constrains the development of entity resolution solutions. For example, suppose your solution needs to include automobile license plate numbers as one attribute to help resolve entities. A purely mathematical probability model won't detect that "13" looks like "B" on a plate, while an attribute-specific analytic quickly makes the connection.
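The license plate case can be sketched as a small attribute-specific analytic. The Python below is a hypothetical illustration, not the product's implementation: the `CONFUSABLES` table and the function names are assumptions made up for this example, and a real table would be tuned to the plate fonts and capture methods in use.

```python
# Hypothetical look-alike table for plate characters. Order matters:
# the two-character pair "13" must be handled before the single "1".
CONFUSABLES = [
    ("13", "B"),
    ("8", "B"),
    ("0", "O"),
    ("1", "I"),
    ("5", "S"),
]


def canonical_plate(plate: str) -> str:
    """Reduce a plate to a canonical form where look-alikes collapse together."""
    # Drop spaces, dashes, and other separators; compare in upper case.
    text = "".join(ch for ch in plate.upper() if ch.isalnum())
    for look_alike, canonical in CONFUSABLES:
        text = text.replace(look_alike, canonical)
    return text


def plates_look_alike(a: str, b: str) -> bool:
    """True when two plates could be the same plate read or typed differently."""
    return canonical_plate(a) == canonical_plate(b)


print(plates_look_alike("A13C 742", "ABC-742"))
print(plates_look_alike("ABC742", "ABD742"))
```

Because this canonical form also collapses a genuine "13" into "B", it is best treated as a way to surface candidate matches that other attributes then confirm or reject, which is where the probabilistic side of a hybrid system comes back in.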
So what's the takeaway? Be skeptical when you hear that rules and probability don't mix. More importantly, question why you have to choose one or the other when you could have both.
