THE STRATEGIC IMPACT OF POOR DATA QUALITY (Part 1)
By
Eugene Reagan
Senior Consultant
This is part one of a two-part article...
Overheard Conversation No. 1:
“Whatever happened to that systems integration project that was supposed to be finished this year?”
“It died a slow death due to data quality problems.”
How does data quality have this kind of impact? Why does it continue to bedevil major strategic initiatives in company after company? Most importantly, what can be done to address this ongoing and growing problem?
At insurance companies around the world, systems have historically been developed to support a particular user or to handle a defined function—in other words, to address a specific need. This was efficient, in terms of both systems development and transaction processing. The result, however, has been a “silo effect,” with systems unable to speak a common language. This has complicated the already complex problems that result from poor data quality within the systems themselves.
In order to grow their revenue bases, insurance companies are using customer relationship management (CRM) systems, data warehouses, and data mining techniques to manipulate existing customer data to market new products. These initiatives can be extremely cost-effective revenue growth strategies. However, the data involved must be accurate, complete, and usable.
Often, the discussion of data quality is limited to the common problem of differing definitions of the same term in different systems. This causes difficulties when the data is integrated into a new database or warehouse. However, the real underlying problems resulting from poor data quality are much deeper and more widespread.
These problems fall into three broad categories:
First, data quality problems can be symptoms that point to underlying defects in existing operating processes.
Second, data quality problems highlight current system problems that cause errors and extra work while reducing the effectiveness of the system itself.
Finally, when data is combined or moved from one system to another, quality problems can arise and be compounded. This last situation is common with the transition to many new database technologies.
Obviously, there is no “magic bullet.” No single solution addresses all these possibilities. What approach, then, should be taken?
While the symptoms of data quality problems are varied, they can be attacked in an organized fashion. Recall the broad definition of quality: “meeting or exceeding customers’ expectations.” When discussing data quality, this definition is complicated by the fact that there are at least three sets of customers—information collectors, information custodians, and information consumers. These customer groups and their issues may or may not overlap, and they may have completely different priorities.
Information Collectors
Information collectors are the individuals and units who create, input, or otherwise manage the capture of data. They normally operate the systems and are often the business owners of the application systems. Frequently, they input data for use in transaction processing. Collectors usually have a strategy built around operational efficiency. They want the data to be formatted so that it can be input quickly, easily, and uniformly. They also want it to be complete for all transactional purposes, including customer service responses. As transactional users, they want easy access with a minimum number of security obstacles.
Information Custodians
Information custodians are the information manufacturers or processors. They are primarily systems staff responsible for maintaining the data and the systems infrastructure.
Custodians have quality expectations built around technical efficiency (for example, storage, access speeds, and programming ease), maintainability, and security. Often, custodians have little real understanding of what the data means. They are largely interested in keeping the data available and secure for the information consumers.
Information Consumers
Information consumers use the end information product in their work. Consumers may be operational staff who use the data from one system to perform one transactional task (for example, processing and underwriting applications through a new business system), or they may be executives using data from many systems to make strategic decisions or execute broad initiatives. The data quality requirements of transactional users have normally been defined during the development of the primary system.
Many data quality issues arise from a different type of information consumer—one who creates a new function that does not currently exist and requires data from multiple systems. A typical example is a marketing initiative that will need to process customer information from separate systems that handle different products or transactions. Since these systems were developed for different purposes, there has been no effort to standardize format or definition.
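To make the marketing example concrete, here is a minimal sketch in Python of what standardizing format and definition can involve. Everything in it is an assumption made for illustration: the two systems, their field names, their date formats, and their gender codes are hypothetical and are not taken from any real application.

```python
from datetime import date, datetime

# Hypothetical extracts from two product systems (all names and formats invented).
LIFE_SYSTEM = [{"cust_name": "SMITH, JOHN", "dob": "03/15/1962", "gender": "M"}]
AUTO_SYSTEM = [{"first": "John", "last": "Smith", "birth_date": "1962-03-15", "sex": "1"}]

# Assumed code table: the auto system stores gender as a numeric code.
AUTO_GENDER_CODES = {"1": "M", "2": "F"}


def normalize_life(rec):
    """Map a life-system record onto the common layout."""
    last, first = [part.strip().title() for part in rec["cust_name"].split(",", 1)]
    return {
        "first_name": first,
        "last_name": last,
        "birth_date": datetime.strptime(rec["dob"], "%m/%d/%Y").date(),
        "gender": rec["gender"].upper(),
        "source": "life",
    }


def normalize_auto(rec):
    """Map an auto-system record onto the common layout."""
    return {
        "first_name": rec["first"].strip().title(),
        "last_name": rec["last"].strip().title(),
        "birth_date": date.fromisoformat(rec["birth_date"]),
        "gender": AUTO_GENDER_CODES.get(rec["sex"], "U"),  # "U" = unknown code
        "source": "auto",
    }


def match_key(rec):
    """Simplistic matching rule: same name and same birth date."""
    return (rec["last_name"], rec["first_name"], rec["birth_date"])


def combine(life_records, auto_records):
    """Group the source systems under each matched customer key."""
    normalized = [normalize_life(r) for r in life_records]
    normalized += [normalize_auto(r) for r in auto_records]
    combined = {}
    for rec in normalized:
        combined.setdefault(match_key(rec), []).append(rec["source"])
    return combined


customers = combine(LIFE_SYSTEM, AUTO_SYSTEM)
```

In this sketch the two records resolve to a single customer only after names, dates, and codes have been converted to one layout. The exact-match rule on name and birth date is deliberately naive; real customer matching usually needs to tolerate misspellings, nicknames, and missing values.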