Thursday, July 1, 2010

Master Data Management: Address Validation Series

Why?


Address information, in particular customer address information, is a core asset of any business.  It plays a pivotal role in two fundamental business operations: revenue assurance and revenue generation.

Without valid, deliverable customer address information, collecting payment for services or products is, at best, a process that requires repeated effort and costs the business labor and resources (and dollars).  At worst, the process fails to collect at all, costing still more time and resources (and dollars).

Without valid, deliverable customer address information, marketing to existing and potential customers is not possible, and attempting it will, again, cost the business labor and resources (and dollars).

What?


So what exactly needs to be validated in order to prevent revenue collection and revenue generation events from failing?

While it is not harmful to have the full complement of customer address data collected, stored, and validated, there are a few pieces of address information that are essential.

Postal Code is an absolute must-have in order to ensure the mail piece is delivered.  Postal Code is the core element that the United States Postal Service uses to route mail.  Without it, deliverability is unachievable.

Street Number and Street Name are also essential pieces of information to collect and validate in order to ensure mail deliverability.  Logically, this information is required to know where within the postal code to deliver the mail.

It is also required, where applicable, to collect and validate additional address location information such as apartment number, suite number, building number, etc.  This gets the mail to the correct destination within a multi-dwelling residence.

In my opinion, this is the data that must be validated to ensure deliverability.  Most address validation services can derive accurate and valid city and state information from the postal code, and that information can then be added to the record and used going forward.
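
As a rough illustration of the essentials listed above, here is a minimal Python sketch that flags which required elements are missing from a record.  The record layout, the field names and the `missing_essentials` function are all hypothetical, and the postal code check only tests the US ZIP or ZIP+4 format; it does not confirm that the code exists or that the address is deliverable, which is the job of an actual validation service.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical record layout; field names are illustrative only.
@dataclass
class Address:
    street_number: str = ""
    street_name: str = ""
    unit: str = ""         # apartment, suite or building number, where applicable
    postal_code: str = ""

# Format check only (ZIP or ZIP+4); this does not prove the code exists.
ZIP_FORMAT = re.compile(r"\d{5}(-\d{4})?")

def missing_essentials(addr: Address, requires_unit: bool = False) -> List[str]:
    """Return the names of essential elements that are absent or malformed."""
    problems = []
    if not ZIP_FORMAT.fullmatch(addr.postal_code.strip()):
        problems.append("postal_code")
    if not addr.street_number.strip():
        problems.append("street_number")
    if not addr.street_name.strip():
        problems.append("street_name")
    # The unit is only essential for multi-dwelling destinations.
    if requires_unit and not addr.unit.strip():
        problems.append("unit")
    return problems
```

An empty list means the record carries every essential element; city and state are deliberately left out because they can usually be derived from the postal code.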

Who?


Who should be responsible for address validation?

As I alluded to earlier, address information is a corporate asset which plays a pivotal role in many essential business operations.

For this reason, address validation belongs in a centralized group made up of representatives from those dependent parties.  In other words, address validation is the responsibility of a corporate data governance group that is aware of all the required aspects of useful address data management.

Typically there are, at a minimum, two levels to this group.  On one level there are business stakeholders that manage and advocate functional business requirements involving address information.  On the other level are the data processors that manage data sourcing, scrubbing, validation and integration of the address information.

Due to the technical requirements in managing such information, Information Technology should be responsible for management of the physical data stores that house the address information.  However, it is crucial to note that this management covers only the software and hardware resources that house the data, not the data itself.

It is imperative that data ownership be the responsibility of the business owners on the governance group.

Where?


Although there are various implementations of MDM, I believe address information belongs in a centralized hub that feeds dependent systems clean and valid address data.  This model ensures the delivery of consistent and valid addresses throughout the enterprise.

This centralized hub needs to be managed in such a way that it is independently supported, ensuring failover, redundancy and archival.  This eliminates the failure scenarios described earlier that interfere with revenue collection and generation.

When?


How often does address data need to be validated?

Several factors influence when address information should be validated, such as an annual change-of-address rate of 17% and quarterly marketing campaigns.  In the end, the answer depends on the finest level of granularity at which the data is used to support business operations that either collect or generate revenue.

If marketing conducts campaigns on a quarterly basis but billing occurs monthly, then address data should be validated on a monthly basis to support accurate and efficient billing operations.
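
The rule of thumb above amounts to taking the shortest interval among the operations that depend on the data.  A small sketch, assuming each operation's cadence is expressed as a number of days between runs (the dictionary contents and the `validation_interval` name are illustrative):

```python
from typing import Dict

def validation_interval(cadence_days: Dict[str, int]) -> int:
    """Return the shortest interval (in days) among the consuming operations."""
    if not cadence_days:
        raise ValueError("at least one consuming operation is required")
    return min(cadence_days.values())

# Illustrative cadences: quarterly marketing, monthly billing.
cadences = {"marketing": 90, "billing": 30}
interval = validation_interval(cadences)
```

With quarterly marketing and monthly billing, the shortest interval is the billing one, so validation would be scheduled monthly.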

How?


How can address validation be implemented in order to support all the benefits described?

In order to validate address information on a periodic basis, manage it across various dependent business units and integrate it into a centralized hub, you need to be able to develop validation routines, business rules, a mechanism for business stakeholder review, and integration routines that can be executed on a schedule.

Within the domain of address validation there are several varieties of output.  For instance, it is possible to develop an address validation process that transforms address information into the correctly formatted address lines that would appear on the envelope.  Another implementation could parse and augment the address information and return its validation status.  Yet another implementation could take the address information input and transform it into the valid delivery address information.
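
To make the first of these output varieties concrete, here is a minimal sketch that turns already-validated address elements into the lines that would appear on an envelope.  The `envelope_lines` function and its parameters are hypothetical, and the layout is a simple assumption rather than an official postal formatting standard.

```python
from typing import List

def envelope_lines(name: str, street_number: str, street_name: str,
                   unit: str, city: str, state: str, postal_code: str) -> List[str]:
    """Build envelope lines from address elements that were validated upstream."""
    def tidy(text: str) -> str:
        # Collapse repeated whitespace and trim the ends.
        return " ".join(text.split())

    street = tidy(f"{street_number} {street_name} {unit}")
    last_line = f"{tidy(city)}, {tidy(state)} {tidy(postal_code)}"
    return [tidy(name), street, last_line]
```

The other two varieties would return different shapes of output, for example a parsed record with a status flag, rather than display lines.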

With various business units consuming address information, there will likely be various business unit specific rules to process the address information.  For example, marketing operations might require that the "vanity" city name be specified.  Vanity city names are usually preferred by customers because of the perception and reputation they carry.  One such example of a vanity address is using Beverly Hills over the validated city name of Los Angeles.  However, billing operations may not have the same requirement.  In this case, and others like it, you need an address validation process that enables the building of business specific rules that can handle variability on the same data element.
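
One way to picture business specific rules acting on the same data element is a small table that maps each business unit to the rule it applies.  This is only a sketch: the `vanity_city` field, the rule functions and the `RULES` table are assumptions made for illustration, not features of any particular MDM product.

```python
from typing import Callable, Dict

AddressRecord = Dict[str, str]
Rule = Callable[[AddressRecord], AddressRecord]

def use_vanity_city(record: AddressRecord) -> AddressRecord:
    """Marketing rule: prefer the vanity city name when one is on file."""
    vanity = record.get("vanity_city")
    if vanity:
        return {**record, "city": vanity}
    return dict(record)

def keep_validated_city(record: AddressRecord) -> AddressRecord:
    """Billing rule: leave the validated city name untouched."""
    return dict(record)

# Hypothetical rule table, one entry per consuming business unit.
RULES: Dict[str, Rule] = {
    "marketing": use_vanity_city,
    "billing": keep_validated_city,
}

def address_for(business_unit: str, master: AddressRecord) -> AddressRecord:
    """Apply a business unit's rule to a copy of the master record."""
    return RULES[business_unit](master)

master = {"city": "Los Angeles", "vanity_city": "Beverly Hills", "postal_code": "90210"}
marketing_view = address_for("marketing", master)
billing_view = address_for("billing", master)
```

Each rule returns a new record, so the master record in the hub stays unchanged while every consumer gets the view it needs.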

In order to enable business stakeholder ownership and help business users define and validate data specific rules, you need to have a mechanism that presents data to these users.  Since these business stakeholders are not typically technically inclined, this mechanism needs to be built in such a way that it minimizes technical effort and enables data review and validation.

Ultimately, this address information needs to be integrated into a centralized hub and distributed to the various consuming applications.  This dictates the need for enterprise-capable data extraction and load features such as scheduling, monitoring and tracking.

How do you deliver on such a complex set of criteria?


It's a challenge.  In fact, it's such a broad topic with so many details that it is not feasible to cover in one blog post. I plan on addressing (no pun intended) each of these areas in more detail over the coming weeks.

So stay tuned to The Data Quality Chronicle for more!

8 comments:

  1. Great, comprehensive post.

    The HOW is much simpler than people realise.

    All data in an enterprise is created by function. All the business rules necessary to create quality data ought to be built into the function that creates it. In this way only quality data is created.

    Why do enterprises allow, indeed enable, bad data to be created and then spend so much time and effort trying to find and correct it?

    Regards
    John

  2. John,
    Thanks for taking time to read and comment on the posting!

    I believe the answer to your question lies in the fact that, as you stated, "all data in an enterprise is created by function". This narrow scope leads to inadequate proactive analysis and design.

    It's hard to plan for enterprise data architecture when budgets are so tight and people's view of the implications is so narrow.

    Thanks again for stopping by and commenting!

  3. A good, insightful article, with many points which I agree with.

    In my mind, at the core of the discussion is the 'ownership' question - my view is also that a centralized group made up of both business and technical stakeholders from across the organisation should 'govern'. Of course the larger the company, the more of a challenge it is to govern, however for large companies it is a key enabler to driving consistency and the integration of information across the enterprise.

  4. Nigel
    Thanks for stopping by to read and comment on the post!
    I agree with the challenge being greater for larger organizations, however the payoff is also greater. The issue for large organizations is one of communication and awareness. It is more difficult for these organizations to have sight of the enterprise issues that are present in their data and it is also more difficult to communicate those issues and build consensus for a positive change.
    I look forward to more discussion, Nigel. Please stop by soon while I elaborate on address validation and how to implement an MDM solution.

  5. William, this series looks like it will be really interesting. Our firm, DMTI Spatial, does a lot of work in address governance solutions. I can say that what you are describing is right on the money. And unfortunately, one of the major obstacles to implementing centralized address governance is that in big companies the "siloed" environment is almost always very real. If you like this topic you may find a webinar we are running this coming Tuesday of interest, as it's on the same topic. https://www1.gotomeeting.com/register/178662920

  6. Alex,
    Thanks for reading and commenting on the post! I agree on the silo issue and believe it is the result of a lack of data governance awareness. If there were more awareness around how address data is a corporate asset, then a different approach could be taken. I think it is an honest lack of awareness which requires more education and evangelism by folks like us!
    I feel positive about the future when it comes to this effort. In 5-7 yrs I feel like there will be more corporate data governance and MDM programs which include master address data management.
    Thanks again for stopping by!
    William

  7. I'm surprised that no one has commented on the semantic issues, namely that many entities (especially business entities) have multiple business addresses for multiple contexts. You've described a very thorough process for validating an address, but where I have struggled is getting acknowledgement from a central organization (especially Finance) that their view of a customer is not the same as mine. Mine is more nuanced; and being in sales is more material since I have the direct customer contact. I'm the one who visits the customer; I'm the one who knows how and why they use different addresses; and I'm the one who has to explain the asinine 'business rules' imposed by a central organization who doesn't care whether or not the address matches business reality; just whether it obeys the rules. (OK, I'll step down from the soap box now)

  8. Alastair
    I appreciate you stopping by and commenting. You brought up an interesting point about multiple addresses for a business entity. I think when people hear "master data" they tend to think of "one" value. This is probably associated with the fact that the phrase "one version of the truth" is used so often during MDM initiatives.
    Fact is you can have multiple master addresses for a business entity as long as, like you indicated, there are valid reasons for doing so. It is not uncommon for a business entity to have separate shipping and billing addresses. It's also not uncommon for a business entity to have multiple shipping addresses. However, this doesn't mean there cannot be master addresses. The key lies in understanding the logic behind why there need to be multiple addresses and reflecting that in the MDM hierarchies. Once this is established, a master record per branch of the hierarchy can be implemented.
    A simplified example can be multiple shipping addresses per territory. Territory is the branch of the hierarchical structure that is managed in terms of master records. Therefore, master addresses are maintained per territory, not per organization.
    One sentiment in your comment did strike a chord, and that is that the governance committee that defines MDM rules needs to include people with the knowledge of the hierarchies that truly define business reality. This is why I like to drive an MDM/Data Governance project with hardcore data quality profiling and data discovery. The results of data quality profiling tend to shed light on the reality that you alluded to in your comment.

