Sacred cows of using data warehouse software – and how to tip them

Date: May 05, 2011

When it comes to using data warehouse software and crafting a data warehouse strategy, consultant Evan Levy thinks that certain ideas and approaches have attained the exalted status of being viewed as absolute truths, or sacred cows – beliefs that can’t be questioned and are beyond reproach. Some prime examples that he cites: data warehouse technology is a commodity, and companies need to stage information in an operational data mart to be able to do time-sensitive reporting.

But such absolutes and constants aren’t necessarily that at all, according to Levy, a partner and co-founder of Baseline Consulting. In a video interview with SearchDataManagement.com, Levy said IT, data warehousing and business intelligence (BI) professionals should challenge conventional thinking about data warehousing projects within their organizations. To ensure that your data warehouse systems don't become outmoded, Levy said it's time to move forward – and to tip data warehousing's sacred cows.

In the 11-minute interview, viewers will:

  • Hear Levy's take on what the sacred cows of data warehousing are and why businesses should question them
  • Find out what data warehouse and BI managers should consider when evaluating data warehouse software and architecture options
  • Learn how to ensure that a data warehouse strategy is the best fit for their organizations and doesn't become outdated
  • Get tips on methods for effectively challenging perceived data warehousing sacred cows

Read the full transcript from this video below:  

Craig Stedman: Hello. I am Craig Stedman, Site Editor of
SearchDataManagement.com, and I am covering the TDWI World
Conference 2010 in Orlando. I am speaking today with Evan Levy,
a partner and co-founder of Baseline Consulting, about data
warehousing technology and project management. Thank you for
speaking with us today, Evan.

Evan Levy: It is a pleasure to be here.

Craig Stedman: You are doing a session here on "Tipping the Sacred Cows of Data
Warehousing". Data warehousing is not really that old, compared
to other aspects of IT. Are there really data warehousing sacred
cows at this point?

Evan Levy: It is really interesting. I did some research on what the term
'sacred cow' meant and where it came from. It actually comes
from the Hindu religion, and it came into existence around the turn
of the 20th century; back in 1905, I believe, is when it came
out. It refers to the cow, because it is such a sacred figure in
that part of the world, and the whole idea behind it is you
cannot challenge something because it is perceived to be a
constant or something that is just beyond reproach. The reason
for using that term is that data warehousing has been
around long enough that there are certain rules or assumptions
that people make that I think we actually should challenge.

One of the things to keep in mind with any technology is nothing
is static, nothing is forever. If you can take a look at PCs,
the whole idea of having webcams built into PCs, being able to
watch movies and having terabyte drives in a laptop was not
really reasonable four or five years ago. How many people would
have thought they would have a GPS built into their phone, much
less have it in their car now? The world of paper maps has
disappeared. Lo and behold, we are in this world of data
warehousing, which has evolved an enormous amount.

The idea of having 20 terabytes was not reasonable 10 years ago,
probably not affordable 10 years ago. It is practical in this
day and age, and a lot of the technologies, architectures and
ideas that data warehousing and business intelligence are based
on date back to the '90s. If you take a look at some of the
early writings, people said, "You would have a centralized data
warehouse." And then they said, "Oh, but wait a second, I cannot
support all the queries and the loading, so I need to have data
marts." Then people said, "Wait a second, if I have data marts
then I cannot get the time-sensitive data that is operationally
oriented, so now I need to have an operational data mart." And
then, "Boy, there are an awful lot of challenges when it comes to
ETL processing, loading and sorts, so I need to have staging
areas."

The reason that we are tipping the sacred cows of data
warehousing is not to say that data warehousing is old, or data
warehousing is no longer relevant -- far from it. In fact, it is
to reexamine why we have some of the beliefs that we have. Are
they still appropriate? Do we still need data marts, operational
storage and staging areas, and do we use them the same way?

The technology has come a long, long way in very few years.
Mixed workloads are commonplace now. Parallel processing is
commonplace now. Very inexpensive storage and ETL tools are very
common, and just-in-time ETL is very practical. One of the
reasons for tipping these sacred cows is
to actually challenge the paradigm. I sometimes joke that when
we are dealing with some of our clients, some people are very
comfortable in what they are doing, and they do not want to
change. If you remember the advent, or the introduction of data
warehouses in most companies, we were challenging the paradigm.
In fact, there were people that said, "I do not need ETL tools;
I can write everything in COBOL. I do not need a data warehouse;
I have certain types of reports already." I do not want to see
business intelligence and data warehousing turn into the COBOL
program of the new millennium. I think there is enormous value in
data warehousing, and there is enormous value in analytics, and
I think we, like any other piece of technology, technology user
or environment, need to grow and evolve.

One of my favorite ones is where people say that third normal
form is something that operational systems are built in, or you
need to have an operational data mart to be able to do
time-sensitive reporting. I do not know that there are any absolutes
anymore, particularly when it comes to data warehousing. I think
where you see the BI tool vendors actually deploy against
operational systems, where you see things like virtualized
databases or virtual data marts exist, being able to merge data
on the fly from certain operational systems actually begs the
question, "Do we need a big centralized data warehouse? Is that
better than other data marts? Is managing multiple data marts
less efficient than a big data warehouse?" Some of those things
actually came into existence because of the limitations and the cost
of the hardware and the different vendor products.

In this day and age we can buy data warehouse appliances, so
the complexity of the systems is dramatically reduced, and our
maturity and knowledge are dramatically greater. So, yes, it is
time that we do challenge some of these paradigms. The
purpose of the class is not to look for absolutes or good and
bad; it is to understand that maybe some of the assumptions we have
made are not the most effective use of our money, and maybe
not the most effective use of our time. Let us take a good,
solid look and just come up with, maybe, new and creative ways
of solving the problems, or in fact, maybe stop working so damn
hard.

Craig Stedman: What are some of the data warehousing sacred cows that you plan
to tip?

Evan Levy: One of the things that we have seen is there is an enormous
proliferation of data marts, and a data mart is a great idea if
I have very specific content: I want to be able to deploy
reports and details to a particular audience of users -- that is
great. Originally, data marts came into existence because there
was an assumption that a big data warehouse could not
possibly support end users doing analytic queries, end users
doing very simple queries, and all the ETL processing that's
necessary. In fact, the technology has come a long, long way,
and I do tell my clients, "Let us challenge the paradigm. Why
are you not putting some amount of query processing on your data
warehouse? Why can you not consolidate data marts?"

In fact, we are seeing companies like Oracle, IBM, Teradata,
Sybase and others challenge the paradigm of having lots of data
marts and a centralized warehouse, saying, "You know something,
maybe we can have fewer, more consolidated systems, reduce manpower,
increase business value access, and it is very feasible."

One of the other ones that we have challenged is the assumption that
every data warehouse is a commodity. In fact, I think we are
still many years from that; the assumption that a parallel DBMS
is a parallel DBMS is nuts. Every vendor's product has
unique capabilities that are different from some of the others;
there is no silver bullet. The way that many of the vendors
implement it, whether it be their appliances or their high-end
parallel processing engines, is dramatically different. To
assume that one vendor can do everything, or in fact, to assume
that the way that stuff used to work is the best way for it to
work now, I think merits some argument, some discussion, to
understand what other individuals are doing. I find that people
use technology in ways we never would have imagined.

I was recently at a Microsoft event where Donald Farmer actually
showed how business people built applications in Excel, clearly
challenging what spreadsheet functionality was. They are
building analytic applications in a spreadsheet tool. Does that
replace a data warehouse? Probably not. The whole idea with
ingenuity and technology is that we can accomplish some amazing
things. I think we should challenge anything that is perceived.
There are other beliefs that we have heard for a number of
years: staging area versus direct access from a source system.
If a source system is a DBMS, why do I want to persist data
in a staging area before I load it? Maybe I can
just do it on the fly. There are tradeoffs. The belief that one
approach is always better than the other is not safe, and
we actually debunked some of those myths.

We also talk about storage area networking, data warehouse
appliances and a few other technologies from a perspective of,
"Wait a second, are we really attacking a myth? Are we attacking
a need? Is this the most effective use of things?" My favorite
example of just sacred cows is the electric can opener. You can
have an electric can opener. The fact is, I have a manual can
opener because with an electric can opener I do not save time, and I
actually can put the manual one away and take it off the counter. I am
perfectly happy with a manual can opener, and most people have
realized that. Years ago everybody went to digital display for
their watches, only to go back to standard analog display of
watches. My family is very fond of an electric carving knife. I
do not know about you, but I can use a normal knife, and I am
not exhausted after slicing turkey. In fact, I find it is a lot
easier to use a manual knife than an electric knife.

I use those examples because I think we have just made the
assumption that if I add more power, or if I throw more
automation at it, or if I use electricity, it is going to be
better. The problems that I had a few years ago, or 10 years
ago, are not the problems that I have now, and the technology
and the capabilities have just evolved enormously, so the things
that we thought were constants or basics -- I think they are worthy of
a discussion.

Craig Stedman: What can people do about some of these sacred cows?

Evan Levy: One of the things that I think is important to keep in mind is there
should not be anything that is sacred. I clearly do not want to
build everything from scratch, saying, "Hey, you know we got all
these great tools. Let us throw them out and build our software
from scratch." I think it's important to understand: what's the
most precious commodity in an environment? Sometimes in data
warehousing, I do not have any more storage or I do not have any
more processing, so that is really the most sacred resource. The
other instance is where people's time is the most sacred
resource. When you have methods or activities where you are
running up against a limit, challenge the sacred cow, challenge
the obvious rule. Not to be difficult, but in reality, if doing
it the same old way costs more, takes longer, and is not likely
to be successful, raise your hand, start the discussion. Not
everything should be an argument; people should not make every
issue a hill to die on. It
should really be about, 'Why is this our policy? Why is this a
requirement?'

We were talking with a participant in today's class
who said they have been told that they need to move
all data, regardless of size, across their enterprise service
bus because the company has made it standard. Enterprise service
buses do a lot of things, but they are not typically designed to
support an infinite amount of data with an infinitesimal amount
of response time, with little or no interaction. That is not why
they exist, so to come up with a rule or a policy that says, "If
you move any amount of data, always use an enterprise service
bus," is a little disturbing, because there are no absolutes.
The cost of technology and the cost of time change, so do what
makes sense. What can you do? Engage in a conversation, do not
make it a do-or-die circumstance, do not make it a catastrophe, but
enter a conversation to ensure that at least people are thinking
through, "Is this really the right decision for the
circumstances that we are in?"

Craig Stedman: One final question. When you tip a sacred cow, should you tip
it away from you as opposed to tipping towards you?

Evan Levy: Always tip a cow away from you, and usually run. Make sure it is a
cow and not a bull.


Craig Stedman: That is very good advice. You can find more practical
information and advice, as well as the latest news about data
warehousing and other data management topics on
SearchDataManagement.com. Thank you for watching.
