Sure, you can build databases and applications without a data model. You can also build a house without an architect or blueprint. Neither is a good idea, according to experts and common sense.
Yet many organizations don't bother with conceptual and logical data modeling, according to Pete Stiglich, senior consultant with Hinsdale, Ill.-based EWSolutions and SearchDataManagement.com's data modeling expert. Pressed for time, money and staff, companies charge forward with database and application development, only to learn later the costly perils of skipping data modeling processes.
Part of the problem may be a general lack of understanding about the discipline, including the difference between conceptual data modeling and physical data modeling or other modeling activities, Stiglich said. Physical data modeling, or defining physical data structures, is essentially required to build a database. Companies typically don't skip this step, but they frequently dismiss it as just a technical task for the database administrator or developer. A conceptual data model, by contrast, is a tool for understanding the business from a data perspective, rather than for modeling a system, Stiglich explained.
"For example, if you have entities like 'customer' and 'order,' how do those relate to each other?" he said. "Can there be many customers on one order or only one customer per order? [Conceptual data modeling] is a way to understand the business from a data perspective, so you can then take that conceptual description and apply that to a particular solution. You can have a conceptual data model that can be applicable to many applications."
The process of creating a conceptual data model also helps organizations uncover and define common data objects and relationships, such as "customer" and "order," which might be used in multiple applications. Many organizations fail to take this business-centric view, though, focusing instead on physical data structures only, Stiglich said. Ostensibly to reduce project delivery time, they skip conceptual data modeling, which should be done in the requirements and design phase. But cutting this corner does not pay off in the end.
"You can miss a lot of requirements by not developing that conceptual data model," Stiglich said. "That can have a dramatic impact on delivering the project on time and on budget."
One company that Stiglich worked with learned that the hard way. The company, which provides pet care services, skipped the conceptual data modeling step, which led developers to make incorrect assumptions about the relationship between "customer" and "pet." The system they put in place assumed that "customer" and "pet" had a one-to-many relationship -- i.e., a customer can have many pets, but a pet (supposedly) belongs to only one customer. In practice, each member of a household could be considered an owner of a pet, so the pet record had to be duplicated whenever someone other than the original customer in the household brought the pet in for services. That created data quality and reporting problems and -- worse -- negatively affected employees' perception of the new system.
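To make the pet care example concrete, here is a minimal Python sketch of the alternative the one-to-many design ruled out: a many-to-many relationship, in which ownership is recorded as its own set of links rather than as a single owner field on the pet. The class names, field names and sample data are hypothetical illustrations, not details of the system Stiglich described.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str


@dataclass(frozen=True)
class Pet:
    pet_id: int
    name: str


@dataclass
class PetRegistry:
    """Hypothetical many-to-many model: ownership is a set of (customer, pet) links."""
    customers: dict = field(default_factory=dict)
    pets: dict = field(default_factory=dict)
    ownerships: set = field(default_factory=set)  # pairs of (customer_id, pet_id)

    def add_customer(self, customer):
        self.customers[customer.customer_id] = customer

    def add_pet(self, pet):
        self.pets[pet.pet_id] = pet

    def link(self, customer_id, pet_id):
        # Record that this customer is an owner of this pet.
        if customer_id not in self.customers or pet_id not in self.pets:
            raise KeyError("unknown customer or pet")
        self.ownerships.add((customer_id, pet_id))

    def owners_of(self, pet_id):
        return sorted(c for c, p in self.ownerships if p == pet_id)

    def pets_of(self, customer_id):
        return sorted(p for c, p in self.ownerships if c == customer_id)


# Two members of one household own the same pet.
registry = PetRegistry()
registry.add_customer(Customer(1, "Alex"))
registry.add_customer(Customer(2, "Sam"))
registry.add_pet(Pet(10, "Rex"))
registry.link(1, 10)
registry.link(2, 10)

print(registry.owners_of(10))  # both customer IDs
print(len(registry.pets))      # still a single pet record
```

Because ownership lives in the link set, a second household member can bring the pet in without a duplicate pet record being created. In a relational database, the same idea would typically be expressed as a separate association table between the customer and pet tables.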
Despite stories like these, some organizations may still need to be convinced that the time and resources required for data modeling are worth spending. Most of the costs are in employee and project time, though technology such as data profiling tools can also support the process. The long-term benefits make data modeling worthwhile, Stiglich said.
"It's easier, and it costs a lot less, to fix something up front in the requirements and design phase than once it gets into development and construction," he said.
It can be very expensive to fix problems caused by poor data modeling after the fact, he said, especially if the core structure of a database is affected. The problems caused by a poorly developed system can ripple through an organization, propagating data quality problems and skepticism over data accuracy. Worse, if a problematic new system is a source for data warehouses or business intelligence applications, it can sully decisions and insights derived from those systems.
Estimating the ROI of data modeling
Doing -- or not doing -- data modeling affects specific project costs, according to Steve Hoberman, a Westfield, N.J.-based consultant, trainer and presenter, and author of Data Modeler's Workbench and Data Modeling Made Simple. These costs are worth investigating when building a business case for any data modeling activity.
- Data quality can be seriously affected, so potential costs related to poor data quality are important to examine and include in a business case.
- Support costs can be affected because systems are more difficult to support without a data model.
- Training new staff on systems is easier with a data model -- or more time-consuming and costly without one.
- Integration costs can be affected because many systems will eventually need to be integrated with other systems -- a process that's much easier with documented data models.
To help put the importance of data modeling in perspective for businesspeople, Hoberman has used two approaches successfully.
"You can draw the analogy to an architect. What happens if you create a building without a blueprint?" Hoberman said. "Another approach you can take is instead of focusing on why you need a data model -- think of the things that a data model gives you, like better data quality and common data definitions."