Open source data quality software to help with name matching

Learn what to consider before adopting open source data quality software to help with identity resolution and name matching.

Just wondering if you could recommend a low-cost identity resolution tool for advanced name matching – preferably some open source data quality software? We’re looking for something that can take into account out-of-order names, misspelled names and name variations in various languages – for example, Francis Julien = Frank Julien = Julien Franciose.

Here is the advice I would give. First, a quick web search will reveal a number of open source or free options, so you’re probably in as good a position as I am to answer this question. Alternatively, as I did 15 years ago, you can spend a little time reading up on generally accepted algorithms (edit distance calculations, n-gramming, data edit rules; dataqualitybook.com links to my new book, which has chapters discussing these techniques) and then implement them yourself.
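
To give a feel for what implementing these algorithms yourself involves, here is a minimal Python sketch, not a production matcher. It combines a Levenshtein edit distance, a character-bigram (n-gram) similarity, and an order-insensitive token comparison. The NICKNAMES table, the function names and the greedy pairing strategy are all assumptions made for illustration; a real deployment would need a far larger variant table and tuned thresholds.

```python
# Hypothetical nickname table for illustration only; a real one would be
# much larger and language-aware.
NICKNAMES = {"frank": "francis"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def bigrams(s: str) -> set:
    """Character bigrams of s, padded so word boundaries count too."""
    padded = f" {s} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def ngram_similarity(a: str, b: str) -> float:
    """Dice coefficient over character bigrams, in the range 0..1."""
    x, y = bigrams(a), bigrams(b)
    return 2 * len(x & y) / (len(x) + len(y))


def token_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to a 0..1 similarity."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def normalize(name: str) -> list:
    """Lowercase, split into tokens, and map known nicknames."""
    tokens = name.lower().replace(",", " ").split()
    return [NICKNAMES.get(t, t) for t in tokens]


def name_similarity(left: str, right: str) -> float:
    """Greedily pair each token with its closest counterpart, ignoring order."""
    a, b = normalize(left), normalize(right)
    if not a or not b:
        return 0.0
    if len(a) > len(b):
        a, b = b, a
    remaining = list(b)
    total = 0.0
    for token in a:
        best = max(remaining, key=lambda c: token_similarity(token, c))
        total += token_similarity(token, best)
        remaining.remove(best)
    # Unmatched tokens in the longer name lower the score.
    return total / len(b)


if __name__ == "__main__":
    for other in ("Frank Julien", "Julien Franciose", "Maria Lopez"):
        print(other, round(name_similarity("Francis Julien", other), 2))
```

With the names from the question, "Frank Julien" and "Julien Franciose" both score high against "Francis Julien", while an unrelated name scores low. Where to draw the match threshold is a tuning decision that depends on your data.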

You also have to consider the total cost of ownership of open source data quality software. Determine whether the upfront effort required to get the product up and running, and to find the right expertise to help you adjust the rules in the software, cancels out the savings compared with investing in tools whose vendors will get you going relatively quickly. I’m not advocating one way or the other; you simply have to think about what is best for your environment.
