Database technology may be getting faster and faster, but that is no reason to avoid creating a data model. In fact, says data modeling expert and consultant Len Silverston, data modeling techniques may be more important than ever in the age of columnar databases and agile development. Silverston, the author of The Data Model Resource Book and a well-known speaker at data management conferences, says things move so fast these days that failing to create a proper data model can lead to serious integration woes in the future.
SearchDataManagement.com recently caught up with Silverston to talk about how agile development methodologies and the growing popularity of superfast databases are affecting the world of data modeling. Silverston also offered a brief overview of data modeling techniques and discussed why the main reasons for data modeling oftentimes come into conflict with one another. Here are some excerpts from that conversation:
For the newbies out there, what exactly is a data model?
My view – and there are many views – is that a data model is a graphic representation of the structure of the data. There are three purposes to data modeling: One is to understand the data requirements. The second [purpose is that data models are used] as a foundation for the design of a database implementation. The third purpose is integration. It allows data to be integrated. What is quite interesting is that [the first two] purposes are often at odds with each other.
How do the goals of understanding requirements and building a foundation for design come into conflict with one another?
When you build a data model to really understand the requirements, you tend to be specific. [You would] listen to the requirements and put it down in a very specific format. When I build it as a foundation for the design of an implementation, I’m going to build in flexibility, a generalized model, and build in features. It’s kind of like if you’re building a house. If you talk to the owner, the model might look like a model house. But when you’re the architect and talking to the builder, you’re going to want technical blueprints and specifications.
What is a ‘generalized’ data model?
An example might be if you’re doing a data model for a customer system and the business representative [wants] the email, phone number and fax [information for these customers]. In a very specific model, where I understand these requirements, I might have an entity called ‘customer, email fax and phone.’ Now, if I implemented that, I could miss out on all the future requirements. In the future, the [customer] might have two faxes, two phones … a Twitter account, a LinkedIn account and a Facebook account. So I might build a data model that says, ‘Here is a customer that has many contact mechanisms of different types.’ I generalize the model to store any type of contact information, even [if] it comes in the future.
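To make the contrast concrete, here is a minimal sketch of the two approaches Silverston describes, written in Python. The class and field names (SpecificCustomer, Customer, ContactMechanism, kind, value) are hypothetical and chosen only for illustration; they are not taken from Silverston's published models.

```python
from dataclasses import dataclass, field


@dataclass
class SpecificCustomer:
    """Specific model: one fixed attribute per contact type requested today."""
    name: str
    email: str
    phone: str
    fax: str


@dataclass(frozen=True)
class ContactMechanism:
    """One way of reaching a customer; `kind` is deliberately open-ended."""
    kind: str   # e.g. "email", "phone", "fax", "twitter"
    value: str


@dataclass
class Customer:
    """Generalized model: a customer has many contact mechanisms of any type."""
    name: str
    contacts: list[ContactMechanism] = field(default_factory=list)

    def add_contact(self, kind: str, value: str) -> None:
        # A new contact type needs no change to the structure, only a new row.
        self.contacts.append(ContactMechanism(kind, value))

    def contacts_of(self, kind: str) -> list[str]:
        # Return every stored value for the given contact type.
        return [c.value for c in self.contacts if c.kind == kind]


customer = Customer("Acme Corp")
customer.add_contact("phone", "555-0100")
customer.add_contact("phone", "555-0101")   # a second phone fits without redesign
customer.add_contact("twitter", "@acme")    # so does a type nobody asked for at first
```

The specific class mirrors what the business representative asked for and is easy to read back to them, while the generalized class trades some of that directness for room to grow, which is the tension between the first two purposes of modeling described earlier.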
What is agile development?
Agile development is nimble development. [It’s] completely focused on business requirements and [developing something of value to the business in very short bursts of time]. These quick delivery cycles in agile development are called sprints. It’s great because it [shines a light on] the idea that we really have to deliver in IT and deliver business solutions. It also talks about [how] in the past we did these long requirement documents and long design documents. [Those documents change frequently], so let’s just get the basic idea and get right down to it and deliver something. We can deliver it, and if it changes later, we can change the whole thing anyway.
Should organizations apply data modeling techniques in an agile development environment?
This is a huge topic for data modeling. In some agile development methodologies, there is no mention of modeling at all. In others, they have steps to do it and to do it quickly. The reason we do need to model in an agile environment is the same reason we always needed to model. [A data model] is a great thing to be able to produce quickly, and [they often] need to be produced quickly. But if we don’t understand the data and if we don’t develop systems with good infrastructure then [two things could happen]: One is we would have very poor designs. For instance, the design wouldn’t accommodate future needs, and [with that], maintenance becomes a big issue. The second problem, which is even bigger, is integration. In our world, integration is such a huge issue.
How can failing to model in an agile development environment potentially hurt integration levels?
In an agile development effort, there is a mindset that this is a tight-knit team working in close proximity and reaching out to one another and it’s good. However, what I see missing is this link to the integration. In some agile development environments they have that; but in a lot of textbooks, they don’t mention data modeling at all.
What is your reaction to those who believe that, in many cases, things like columnar and NoSQL databases are so fast that they negate the need for data modeling techniques?
That is another hot topic. A couple of weeks ago, I went to the [TDWI Executive Summit] and went around to some of the booths and was talking to some of the vendors. They were saying, ‘You know, with our databases, you don’t really need to model.’ Well, I don’t really buy the argument. It’s been a fact with a lot of these databases and [agile development], that it’s to some degree even more important to data model because things are happening so quickly. In order not to make a mess, we need to understand how [everything] fits together. We need to understand not only the overall data model but the specific requirements and the specific data structures. So this argument that there is not a need for a data model, I don’t think it’s true.