This is a two-part article:
Columnar database software, other technologies aim to speed analytics
Analytical database tips: deployment advice and potential challenges
The first part of this article discussed columnar database software and other technologies that can help boost analytics performance. But regardless of which analytical database technology you use, you should position it within your data warehousing and business intelligence (BI) architecture using at least one of the following two approaches.
First, analytical databases that are used to process queries typically should be set up as data marts, not as enterprise data warehouses, or EDWs. Split your architecture into an EDW that handles the back-end data integration tasks and a data mart layer where the front-end analytics processing is done.
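The split described above can be sketched in a few lines. This is a minimal illustration only: it uses Python's built-in sqlite3 module as a stand-in for both tiers, and the table names (edw_sales, mart_sales_by_region) and sample rows are hypothetical, not taken from any product. In a real deployment the two tiers would typically be separate systems.

```python
import sqlite3


def build_mart(conn: sqlite3.Connection) -> None:
    """Rebuild the front-end data mart table from the back-end EDW table."""
    conn.execute("DROP TABLE IF EXISTS mart_sales_by_region")
    conn.execute(
        """
        CREATE TABLE mart_sales_by_region AS
        SELECT region,
               SUM(amount) AS total_amount,
               COUNT(*)    AS order_count
        FROM edw_sales
        GROUP BY region
        """
    )


# Back-end tier (hypothetical): integrated, detail-level rows live in the EDW.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edw_sales (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO edw_sales (order_id, region, amount) VALUES (?, ?, ?)",
    [(1, "East", 100.0), (2, "East", 50.0), (3, "West", 75.0)],
)

# Front-end tier (hypothetical): BI queries read the pre-aggregated mart only.
build_mart(conn)
rows = conn.execute(
    "SELECT region, total_amount, order_count "
    "FROM mart_sales_by_region ORDER BY region"
).fetchall()
print(rows)
```

The point of the sketch is the direction of flow: integration work lands in the EDW table, and analytics queries touch only the mart table that is derived from it.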
You don’t need the new crop of analytical databases that have become available in recent years to power your data marts, of course. But the relational databases that dominate the transaction processing world tend to require a lot of resources and IT staff time to meet BI performance needs – and even with that kind of investment, they often still fall short of what the business requires.
To avoid the BI shortcomings of relational software, many enterprises have used online analytical processing (OLAP) databases as their front-end analytics tier. Columnar software, massively parallel processing technology and other modern analytical databases are the next step in the evolution of the front-end architecture. They’re generally more scalable and extensible than OLAP tools are, potentially enabling organizations to build a single “enterprise data mart” instead of multiple marts.
The second architectural approach for analytical databases is to establish a closed loop in which the data generated by predictive analytics software, data mining tools and other advanced analytics applications is fed back into the EDW and your data marts. That kind of loop makes it possible to capture information such as business forecasts, possible future scenarios and marketing campaign analyses and then share it across the enterprise.
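As a rough sketch of that closed loop, the example below writes the output of a deliberately naive forecast (a per-region average) back into the warehouse and then exposes it alongside actuals in a mart view. Everything here is an assumption made for illustration: sqlite3 stands in for the EDW and the data mart, the names edw_sales, edw_forecasts and mart_sales_vs_forecast are hypothetical, and a real predictive analytics tool would replace the averaging step.

```python
import sqlite3
from statistics import mean


def write_back_forecasts(conn: sqlite3.Connection) -> list:
    """Compute a naive per-region forecast and store it in the EDW."""
    history = conn.execute("SELECT region, amount FROM edw_sales").fetchall()
    by_region = {}
    for region, amount in history:
        by_region.setdefault(region, []).append(amount)
    # Placeholder model: the forecast is simply the mean of past amounts.
    forecasts = [(region, mean(amounts)) for region, amounts in sorted(by_region.items())]
    conn.execute("DELETE FROM edw_forecasts")
    conn.executemany(
        "INSERT INTO edw_forecasts (region, forecast_amount) VALUES (?, ?)",
        forecasts,
    )
    conn.commit()
    return forecasts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edw_sales (region TEXT, amount REAL)")
conn.execute(
    "CREATE TABLE edw_forecasts (region TEXT PRIMARY KEY, forecast_amount REAL)"
)
conn.executemany(
    "INSERT INTO edw_sales (region, amount) VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 75.0)],
)

# Step 1: analytics output is fed back into the EDW.
write_back_forecasts(conn)

# Step 2: the data mart shares actuals and forecasts side by side.
conn.execute(
    """
    CREATE VIEW mart_sales_vs_forecast AS
    SELECT s.region, SUM(s.amount) AS actual_total, f.forecast_amount
    FROM edw_sales AS s
    JOIN edw_forecasts AS f ON f.region = s.region
    GROUP BY s.region, f.forecast_amount
    """
)
shared = conn.execute(
    "SELECT region, actual_total, forecast_amount "
    "FROM mart_sales_vs_forecast ORDER BY region"
).fetchall()
print(shared)
```

Storing the forecasts in the warehouse rather than only in the analytics tool is what closes the loop: any mart built from the EDW can then pick them up.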
Analytical database challenges and potential drawbacks
As with any technology, analytical databases come with challenges that you’re likely to encounter. The first often is convincing IT managers and staffers to use analytical databases in the first place. Most companies have been relying on relational databases for years to support their enterprise applications and data warehouses, and IT’s experience and expertise typically rest with relational software.
As a result, the next challenge may be finding workers with analytical database experience and skills. This is a common issue with emerging technologies, and organizations should establish a staffing plan to hire or groom the talent required to deploy, manage and support these databases. But companies likely will need to rely on consultants or database vendors to help kick-start their projects.
The danger of not getting skilled people in the beginning is that an analytical database won’t be implemented correctly, either making it difficult for business users to get what they need or resulting in poor database performance – or, perhaps, both. That can lead to frustration with the new technology and thwart a successful deployment.
Another challenge is setting realistic expectations on what the products can do and how long it will take to develop working systems with them. Too often both IT and the business get carried away with slideware and industry hype. Remember that an analytical database is only as useful as the data that goes into it. If your data isn’t clean and consistent, or if the data that’s required isn’t available, you should make it clear that data integration and cleansing work will be a significant part of the project.
The biggest potential drawback of analytical databases is product lock-in: Once you’ve implemented one, the only way to replace it is to rip it out and start again. That’s especially true because of the proprietary nature of the technology and the fact that application logic is embedded into the databases to provide scalability and boost performance. But the reality is that enterprises get locked into many IT products. When evaluating and selecting analytical databases, weigh the advantages they can provide against the downsides of lock-in. In most cases, the potential return on investment more than justifies the risks.
Third time’s the charm with analytical database software?
To sum things up, the first generation of BI and analytics systems consisted of EDWs based on relational database technology. That approach hit a wall when it became difficult to produce a single data model to serve both the analytical needs of business users and the data integration demands of getting data ready for analysis. In the second generation, enterprises built data marts with either relational or OLAP databases to handle their BI processing. Those systems worked well for the initial scope of analysis and the number of business users performing that analysis.
But they also have started showing their limitations as BI and analytics have become more critical to business success, and as many organizations have moved toward pervasive BI deployments involving more of their end users. The new generation of analytical databases can help provide the BI and analytics capabilities that are essential to organizations, as long as you design and deploy them properly.
About the author: Rick Sherman is the founder of Athena IT Solutions, a Stow, Mass.-based firm that provides data warehouse and business intelligence consulting, training and vendor services. In addition to having more than 20 years of experience in the IT business, Sherman is a published author of more than 50 articles, a frequent industry speaker, an Information Management Innovative Solution Awards judge and an expert contributor to both SearchBusinessAnalytics.com and SearchDataManagement.com. He blogs at The Data Doghouse and can be reached at firstname.lastname@example.org.