Friday, January 05, 2007

CDI - Matching Engines

Another tip from CDI expert Jill Dyche on the subject of matching engines and whether or not to base your CDI software decision entirely on the quality of the match engine:

Here are some other things to keep in mind when assessing matching engines:

  • Test with lots of real data; compare and benchmark results. Organizations testing CDI hubs should test with as much data as possible to get a more real-world test of the system, Levy said. DataFlux's Gidley also recommends profiling data sources to assess the accuracy and completeness of data. This can help with tuning the matching engine's business rules. For example, if 50% of the phone number fields in a database are empty, it might not be a good matching attribute.
  • Consider the interface. When a system isn't sure whether records match, it generally refers the matter to a human data steward to make the call. So the interface for data stewards is an important part of a tool decision, Gidley said.
  • Think globally and futuristically. Language differences, industry-specific requirements, and future infrastructure plans -- such as moving to a service-oriented architecture -- can also affect matching engine choices, according to Dyche.
Overall, Dyche said, the matching engine should be only a part of the overall CDI tool choice.

"Matching is one of many decisions to make," Dyche said. "When we see these vendor bake-offs and companies get down to [which CDI vendor] has the most accurate match, that's still only one component of the decision."

So, in some cases, the CDI tool with the most accurate matching engine might not be the final choice. Companies must consider the big picture, including data volumes, processing speed and functional requirements, Dyche said.