Thursday, March 29, 2012

Justin Bieber abuses Twitter but proves how similar phone numbers are

Look, I can't believe I have managed to work Justin Bieber into a data management blog, but I have.

When Bieber tweeted 9 digits of his phone number and asked his Twitter followers to guess the 10th digit and call him, he set two unsuspecting victims' phones on fire. He also proved how similar phone numbers are and why they are bad candidates for match strategies.

Using phone numbers in match strategies is, in my opinion, a waste of time. Unless you require an exact match, you are only going to increase your chances of generating false positives. My issue with exact matches is that a single "fat-finger" error at data entry can turn one phone number into another perfectly valid one, so even an exact match can be a false positive.
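
To make the similarity point concrete, here is a minimal Python sketch. The nine-digit prefix is made up for illustration (it is not the number from the tweet), and the helper name hamming is my own. It builds the ten numbers a follower could have dialed and counts how many digits separate each pair.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length digit strings differ."""
    if len(a) != len(b):
        raise ValueError("numbers must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical nine-digit prefix, chosen only for illustration.
prefix = "202555014"

# The ten numbers a follower could dial by guessing the last digit.
candidates = [prefix + str(d) for d in range(10)]

# Collect the distance between every pair of distinct candidates.
distances = {hamming(a, b) for a in candidates for b in candidates if a != b}
print(distances)
```

Every pair differs in exactly one position, so any fuzzy rule that tolerates a single differing digit would treat all ten numbers as the same phone.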

If you care, here is the link to the Bieber event.  If you are a Bieber fan and found your way to my blog by some search engine failure, my apologies (please don't terrorize me).

Justin Bieber abuses Twitter with phone gag, may get sued - Technolog on msnbc.com.

2 comments:

  1. Good one William,

    Using phone number(s) in matching may not be a great idea, as you mentioned, but I still tend to take it as an affirmative criterion when most of the other critical elements end up being good matches. Say you have a name, address, DOB, gender and identifier match; a matching phone number kind of confirms a good match. Usually we apply edit distance on phone. In the above scenario we might give a good boost to the overall score if it's an exact phone number match, and a significantly lower one if the numbers are one or more edits apart.

    My 2 cents

    -Prashant

  2. They can be used as a confirmation of accuracy. However, if you consider how multiple customers can share the same phone number, this may not be as accurate as you'd think. In business situations, different people can share a phone number or have a very common phone number. In retail-type situations, households with multiple customers complicate this in the same way.
    I have used phone number as confirmation of an accurate match, but that can be misleading.

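
The confirmation approach described in the comments above could be sketched roughly as follows. This is one reading of that comment, not anyone's production rule: the function names, the boost values (0.10 and 0.02), and the 0.85 threshold are all hypothetical. The phone number is consulted only once the other fields already agree well, and a near match earns a much smaller boost than an exact one.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def digits_only(phone: str) -> str:
    """Strip punctuation and spaces so formatting differences are ignored."""
    return "".join(ch for ch in phone if ch.isdigit())


def phone_boost(phone_a: str, phone_b: str,
                exact_boost: float = 0.10, near_boost: float = 0.02) -> float:
    """Return a score boost; both weights are hypothetical, not recommendations."""
    a, b = digits_only(phone_a), digits_only(phone_b)
    if not a or not b:
        return 0.0  # a missing phone number confirms nothing
    d = edit_distance(a, b)
    if d == 0:
        return exact_boost
    if d == 1:
        return near_boost
    return 0.0


def overall_score(base_score: float, phone_a: str, phone_b: str,
                  threshold: float = 0.85) -> float:
    """Use phone only as confirmation when the other fields already match well."""
    if base_score < threshold:
        return base_score
    return min(1.0, base_score + phone_boost(phone_a, phone_b))
```

Because the boost is skipped whenever the base score is below the threshold, a shared or mistyped phone number can never create a match on its own. It also cannot tell apart two people who legitimately share one number, which is the caveat raised in the second comment.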

What data quality is (and what it is not)

Like the radar system pictured above, data quality is a sentinel; a detection system put in place to warn of threats to valuable assets. ...