This article originally appeared on the BeyeNETWORK.
Have you ever found yourself trying to explain to your client or management team that the data mart they want you to “build in 90 days” is not going to meet their expectations? “Yes, there are articles or vendors that say that you can do it in 90 days,” you may say, and then add “but they do not say how well it met the business need.” Many of us have experienced this first hand. The following story is the real experience of an organization that received their data mart in 90 days—only to pay again to have it done right.
The client is a Fortune 500 manufacturer that is a leader in its industry. Its success is due to its constant desire to meet or exceed the expectations of its customer every time it delivers a product. This constancy of purpose has made it the dominant market player. Its customers have come to expect only the best, because this manufacturer usually delivers on its promise.
The manufacturer serves many very large commercial customers, each with very specific needs in terms of the products they desire. Each product requires strict adherence to customer quality requirements. Not completing customer orders on time can have a significant financial impact to its customers, therefore under-producing a committed product can result in severe fines which are included in its contracts. On the other hand, given the short shelf life of its products, over-production can result in excess product that must be sold at a discount, and even a loss.
Once production begins, some products may be redirected in mid-process to meet the needs of other customers, but most may only be sold as seconds or discarded altogether; either way, an expensive proposition.
The ability to forecast customers’ demand up to 18 months in advance, to translate this forecast into a production plan and a production schedule is critical to avoid under- and over-producing any one product at any one time. Also, this production planning process must ensure that the proper amounts of raw materials are available at the right time, at the right production line and at the proper level of quality—every day.
In addition, knowing that the production capacity will be exceeded 6 months in advance for a given line of products, allows the company to build a new plant, or add a new production line before the current capacity is exhausted. Proper short- and long-term forecasting and planning saves hundreds of thousands of dollars every day.
The Business Need
This demanding environment imposes pressing needs on the demand forecasting department to better know the needs of the customer and on the production planning department for better production plans and schedules.
As a result, the forecasting and planning areas have acquired over time several operational systems to support their functions. Some systems are home-grown, some are purchased packages. These systems are very complex, using sophisticated statistical models to predict demand and to project production for each product. Using these systems, the organization is able to determine with precision future customer demands and translate them into actionable plans for the plants and lines.
However, despite all these operational systems, the company still lacked the ability to perform trend analysis for some critical dimensions of their planning efforts. Two critical areas of trend analysis not well supported are adherence and capacity analysis.
Adherence analysis determines the accuracy of the planning efforts as compared with the actual production of a product at a time and place. Adherence analysis requires plan and production data integrated in a uniform manner to effectively identify areas of opportunity to improve forecast and plan accuracy.
Capacity analysis determines if the existing capacity is being used to its optimum (actual capacity), or if it will be sufficient to meet the forecasted demand (demand capacity). Capacity analysis requires consistent capacity factors and time frames to effectively determine where the shortfalls and the excesses will be.
The Original Project
Last year, a team was asked to build the data mart in 90 days; a time box chosen before the project began. It required: two weeks for analyzing the business needs; two weeks for designing the overall solution, four weeks for developing the databases and the data movement, and finally, four weeks for testing, loading and rolling out the solution. The participants were committed to the time and cost constraints and the project was delivered according to the guidelines: a “time box success.”
The new data mart provided excellent support for one critical function: standard deviation analysis. It was able to produce some reports based on the information moved to the data mart, but fell short in enabling the analysis for adherence and capacity. The solution had other shortcomings such as cryptic names (names that the business did not understand) and some of the data was not very usable for reporting due to differences with granularity of time, product or customer.
In addition, the reporting team had to perform too many operations to produce results. For example, missing data means planning for zero production or inventory; the data structure did not contain zeros in the missing planning segments and the reports had to compensate using unions. This resulted in a highly unreliable reporting environment where different results were produced by different programs using the same data.
Soon after the data mart was placed in production, the customer funded a subsequent project to address these shortcomings.
The Revised Approach
A new project was launched but this time the IT department took the leadership to develop a work plan with reasonable timeframes and with proper checkpoints to validate that the time was right for the tasks at hand.
The new design process was focused on the customer and the customer’s needs; not the “time frame” or the “cost limit.” There were time and costs constraints, but not as overriding concerns; there was an agreement to balance the cost and time with meeting the customer’s expectations.
The Analysis Phase
First, the team focused on understanding the business needs in depth; to know what the customer knew about their data and the processes that used the data. This step proved invaluable in subsequent activities. To gather the needs, we used a “data analysis worksheet” as shown in Figure 1. Data requirements were identified based on: (A) known reports or ad-hoc information needs mapped to potential sources; or, (B) known data sources already identified mapped to a specific reporting need. This process took nearly two months and involved business and systems personnel from all the source areas, target areas and the data warehouse area.
Figure 1 - The Data Analysis Worksheet
As data was identified, it was validated with the intended customers (report recipients and developers), and suppliers (source systems business and IT personnel); then, it was profiled (using column and relationship profiles) to ensure consistency between the actual content and the business expectations. This approach enabled the team to reduce new source systems from seven to four by reusing existing data in the staging area. Also, it enabled the reduction from more than 2,000 to close to 1,400 “entries” in the work sheet; a savings of 30 percent in subsequent work.
At the end of the analysis phase, all the participants were delighted to have a complete view of the entire subject matter for the project.
In addition to identifying, profiling and defining the required data, the team also developed operational definitions for critical business concepts using the business concept definition process. It is important to clarify that these are not “conceptual definitions” or “documentation.” Operational definitions enable all customers and suppliers of the information to have a common and consistent understanding of the information. These definitions drive the data and process design; they are the centerpiece of the data mart. Figure 2 illustrates an example of a business concept.
Figure 2 - Definition of "Demand Forecast"
Also, the analysis included a “post mortem” assessment of the previous effort. This was not to place blame or fault, but to understand the root causes of the shortcomings. To be able to answer questions such as “why was capacity analysis not possible in the existing data mart?”
In doing this analysis the team discovered several critical design considerations, the following are two examples:
- Must develop a time management strategy that addresses the disparate time granularities from the different systems. Also, must resolve the “variable calendar” into a “flat calendar” to use consistent time intervals in the analysis. A variable calendar means that one plan or forecast can contain different period durations such as weeks for early periods, months and quarters for later periods—all in the same series.
- Must develop a strategy to avoid the need for “unions” and other techniques to have data for reports and analysis that is not present in the source systems; the business rule is “missing data for a planning period is planning to produce or inventory zero product.” This business meaning must be reflected in the design.
Based on these findings, the data analysis work sheet, the critical business concepts and the critical design considerations, the team proceeded to the design phase. Reaching the point for having the necessary knowledge on the business needs took nearly two months.
The Design Phase
First, the team focused on the data design. What should the data structure look like to provide the foundation needed to provide adherence, capacity and other analysis? We identified the following critical design considerations:
- The business is run on a weekly calendar; months begin on Sunday and end on Saturday based on the weeks of the month. Quarters and years do the same. So the best way to present the entire forecast, plan and actual production is … you guessed it, in weeks. So the team decided to transform all time information to weeks.
- When there is not an entry for a week in a plan or forecast means that the forecast or plan calls for zero production or inventory. The new structure must ensure that all the “planning weeks” associated with a production week are in place even if no data is ever received by the data mart for the week.
Armed with these two design considerations, the team developed several alternatives and decided on the following design: it created the “production week” as the central fact. All the analysis is about the production week. In this table, information is collected about the capacity factors, the actual production and the budget (one per instance, see Figure 3).
Figure 3 - The "Production Week" Fact
The “planning week” was also created to bring together all the data about forecasts and plans for the production week (see Figure 4). The planning week is the week when the plan is created for the production week.
Figure 3 and Figure 4 show that a “fact-dimension” design was chosen. In this design what is a fact in one star (Figure 3) is a dimension in another star (Figure 4).
After undergoing several design reviews with business, systems, data warehousing, and DBA personnel, it was agreed that this design met the business needs and was workable.
Figure 4 - The "Planning Week" Fact
Now attention was turned to the design of the data movement. With a clearly articulated data design, a process to create the “place holders” for these two data structures was needed. To ensure consistent creation of the placeholders, as part of the extract processes, the process creates triggers from each of the incoming files that contain data for the data structure. These triggers drive the create process for the production week and all its associated planning weeks. In this fashion, whenever a forecast, a plan, or any other activity comes in with the need to create a “production week,” this process creates it the same way all the time with all the placeholders set to zeros. Once the structure is created, then the files from the operational systems are used to update the corresponding placeholders. If no data comes in, the placeholder is already in place. This meant no more inaccurate pivoted tables or conflicting reports. The design phase took two months to complete.
The Development Phase
Before a module was ever developed, a detailed specification was created and reviewed by the business and systems personnel in the source and the target areas to make sure that the data was being acquired, moved, transformed and stored correctly. The system consists of 50 modules (or programs) that execute the data movement. Each module has a two to six page specification (Figure 5 illustrates the template). Before a line of code was produced, the team was able to validate the process, identify potential weaknesses in the design and take corrective action to ensure data integrity.
Figure 5 - Data Movement Specification
Once coding began in earnest, there was a complete picture of the entire system, end-to-end. Even though changes continued to be made to the data structure and the module specs, none of these changes were fundamental; the design held as originally planned. In fact, when deficiencies were identified, the well thought out design was easy to change and transparent; that is, it was very good at showing its capabilities and impacts. This made the refinements easier to validate.
The development phase took a little more than two months. It was followed by systems integration testing, data loading and production roll-out. All in all, the project duration was 10 months.
The customer found the new design so well aligned with their needs, that they redirected existing reports from their operational systems to be created from the data mart. They are now able to analyze their capacity and adherence for 18 months into the future and for up to 3 years in the past with a high degree of confidence on the results. This is the result of:
- Allocating the appropriate amount of time and money to achieve the desired result without artificial constrains. This included following the plan and changing the plan only when needed and in full collaboration with all the participants;
- Having a shared understanding of the business needs;
- Developing a data structure and a data movement design aligned with the business need; and
- Developing comprehensive edits to prevent data of unacceptable quality into the data mart in collaboration with the customer.
It may be possible for some data marts to be developed in 90 days; however, I am yet to see one that achieves the desired objectives. More often, as in this case, the customer ends up realizing that to develop the infrastructure required to meet their information needs and to last for a reasonable number of years the “90 days” approach is just a fable. The best way to describe the results achieved by the revised approach is by the customer’s own remark: “this is the best data mart at our company today.”