Building a Customer-Centric Data Warehouse

A major focus for retailers is getting a better understanding of the behavior of their customers. Learn how to build a customer-centric data warehouse from key analytic and data subject areas.

This article originally appeared on the BeyeNETWORK.

In a previous article, a method for putting oneself in the shoes of a customer was presented. Indeed, a retailer’s customers have far more information about them (the retailer) than the reverse. The organizations that strive to understand their customers are likely to be more responsive to customer demand, spend marketing and promotion dollars more wisely, leverage their multiple sales channels more effectively and ultimately drive more loyal behavior from their customers.  

The consensus among analysts, vendors and retailers themselves is that virtually all retailers have a desire to do a better job analyzing customer behavior. Constructing a customer-centric data warehouse for this purpose can be broken down into subject areas – analytics and data sources.

Analytic Subject Areas
Customer Segmentation – breaking down the customer base into segments by a number of different metrics. An analysis of customer quality (recency, frequency, monetary) is often central. Assessing certain metrics individually, such as time since last visit, frequency of visits, sales/revenue, returns and margin contribution of baskets purchased, can provide components to an overall customer quality score. Certainly location of the customer (regionally) and the demographics of the customer are important as well. In the absence of specific data about the actual customer, location-based demographic data (see the Data Subject Areas section) should be used to search for affinities to customer quality. Customer segmentation provides the answer to the question, “What characteristics describe my best (and worst) customers?”

Promotional/Advertising/Campaign Impact – assessing the effectiveness of advertising and price changes in influencing customer visits and purchasing. The basic question marketing wishes to ask is, “Are my marketing dollars being spent wisely?” Merchandising wishes to know if a price change encouraged more profitable or less profitable changes to the market basket. Combining data on actual promotions, advertising and media buys with transaction and foot traffic data provides instant feedback, albeit perhaps inferential feedback, on effectiveness. To the extent that direct marketing campaigns are run, it is of course easier to track their success given the typical promotion code approach, and the notion that the candidate customers are specifically known prior to the campaign mailing or outreach. Promotional impact analysis provides the answer to the question, “What kinds of offers are most interesting to my customers, and how can I improve the effectiveness of my offers overall?” 

Customer Satisfaction – hearing the voice of the customer. Most retailers interact with customers in places other than point of sale. Analyzing the feedback from surveys, call centers and online opinion sites helps marketing understand the issues that positively or negatively affect brand impression, shopping experience and overall satisfaction with a retail chain. Too often, this data is siloed and not easily attributable to sales performance. Traditional call center analytics can provide some insight, but leading edge retailers are starting to look deeper into the transcripts of customer conversations through text miningtechnologies in order to assess what customers are really saying when they take the time to pick up the phone and call a retailer. Customer satisfaction analytics should answer the question, “What qualitative and quantitative things do we do that make customers more or less likely to continue doing business with us?”

Loyalty/Account Program Performance – using data from loyalty/account programs to assess specific customer behavior. Most retailers claim they suffer from the lack of enough customer-specific information to do anything meaningful. However, many of them do have one or more of the following: loyalty programs, customer credit/charge accounts, gift registries, etc. Many retailers also have put in place “householding” algorithms to try to assign customer identifiers even in the absence of a loyalty program. For those that have these programs, it is vital to tie transactions and other customer interactions (website, call centers, etc.) to the customer’s identifier and use that link to perform key analysis. Comparing the set of customers who are part of a program versus not part of the program is a basic analysis to determine the value of the program itself, almost like a giant control and experimental group whose behaviors can be studied in the presence or absence of incentives. Loyalty program analytics should answer the question, “Are my efforts to retain customers and improve their relationships with us working?” 

The biggest roadblocks that retailers cite when looking to create a customer-centric data warehouse are:

  • Not enough customer-specific data
  • Poor data quality, or lack of believability in the data
  • Disparate data that can’t be matched for analysis 

Clearly, centralizing the data needed for analysis is the “horse” that pulls the analytic “cart.” Not surprising. Organizations that embrace proactive data integration, cleansing and stewardship programs have always been more successful in data warehousing than organizations that do not. If integration and quality were free and easy, everyone would do it. Still, setting a goal of customer-centric data warehousing is achievable. Phasing in the key data subject areas outlined subsequently will provide support for the analytics discussed previously.


Data Subject Areas
Transactions and Tenders – point of sale information. Access to basket contents and associated tenders is vital to the customer-centric data warehouse. If an item is promoted, it is nearly impossible to understand the true effectiveness of a promotion unless it is possible to understand whether that item sold more, the impact of the promotion on the typical basket contents and who responded to the promotion. Tying loyalty/customer information to transactions and their line items, along with the method of payment and (potentially) a link to a promotion constitute the basics. 

Customer Master, Loyalty/Charge Accounts – specific or general customer information. Creating and maintaining a “customer master” file is central, even if the notion of a customer is general versus specific. Feeds of loyalty program information, branded charge/credit accounts, gift registries, etc. are obviously very helpful in attribution of characteristics to customers. Customer relationship management (CRM) and campaign management systems are also sources of data that can be integrated to create the customer master files.

Promotions, Advertising, Campaigns – marketing and pricing information. Basics about promotions such as start and end date, theme, items on promotion and effective locations are often kept in places that require manual intervention (e.g., in spreadsheets or Access databases), but getting the data into the data warehouse can be relatively simple. CRM and campaign management systems can be treated like other source systems, with traditionalETL techniques applied to centralizing that information. Price changes as part of promotions should be tracked, although it is possible to leverage that information through the actual transaction details.  

CallCenterData, Surveys, Internet Postings – real customer feedback. Call center/CRM data contains notes fields where CSRs can document the qualitative nature of calls and richer content regarding how often customers call, their moods and other information that can hint at systemic problems that make customers unhappy. Along with survey data, this is “trusted” data, meaning that there is some fidelity to the relevance of the source and that is from actual customers or prospects. It is rich in structured data (customer attributes) and unstructured data (call notes) that can be mined for insight. Internet feedback, now much more common, can be a way to augment these data sources. It is not quite as “trusted,” because of the anonymous nature of the Internet, but can still be influential in a customer’s perception of or reaction to a retail brand and shopping experience. 

Demographics and Market Data – Third-party information about people and market share. Obtaining third-party data from the Census Bureau is a good way to understand the characteristics of a selling geography. The data can be obtained for free (in not-so-easy-to-use formats) or can be purchased after being refined by a third party. In the absence of any loyalty or actual customer information, this data can be a good way to understand the population characteristics near a retailer’s stores. Acquiring market share data from an aggregator can help a retailer assess its performance by category and market, and understand wallet-share information (how much of a customer’s discretionary income is spent with them or other retailers). 

Many retailers have parts of the customer-centric data warehouse outlined in this article. However, they have not taken the step of centralizing the data and analysis to achieve the multiplier effect of cross-silo analysis. The heavy lifting in achieving the vision is an organizational competency around data management and data integration that brings together the data outlined previously. With an enterprise data model and repeatable integration processes, a retailer can enable the customer-centric data warehouse through best-of-breed business intelligence tools, achieving all of this and more.

  • Dan RossDan Ross 
    Dan is the Managing Partner of the Retail Practice at Claraview, a strategy and technology consultancy that helps leading companies and government agencies use business intelligence to achieve competitive advantage and operational excellence. Claraview clients realize measurable results: faster time to decision, improved information quality and greater strategic insight. Dan is a frequent contributor to business intelligence literature, writing on topics spanning technical approaches and business impact, and the Claraview Retail Practice serves some of the world's most advanced users of retail data warehouses.


Dig Deeper on Data warehouse software