Home > Data management / BI Tips > Data Management Insider > Rogue data management
Data Management Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 

DATA MANAGEMENT INSIDER

Rogue data management


Alex Kerezy
06.01.2006
Rating: -4.00- (out of 5)


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   


Every company or organization has some of its key data assets managed outside of its business systems. While most companies are permeated with "rogue data", this type data is often difficult to find, and can be even harder to manage. This article discusses an approach to identifying and managing this data, and provides some insight into some of the challenges associated with rogue data.

One of the more interesting businesses areas I've worked in was an area where there were a large number of statisticians and analytical business users. As I started to take scope of exactly what data there was to manage, I discovered a rather large amount of what I call "rogue data". I consider this data that is maintained outside the control of business systems, aka spreadsheets, and perhaps the most famous database of all, MS Access. Now let's admit it, every company has some of this, but in this case it seemed pretty excessive, and it was in the hands of a rather highly analytical group of professional people. Many of the people in this group who could probably teach an advanced course in MS Excel, and could query a database better than most of the IT staff.

One particular incident I remember was one of these business users testing an interface built for an application. He requested that the data be exported into a comma separated value (csv) format so he could import it into a spreadsheet and 'test it'. He went to work like a concert pianist carefully and intensely pounding the keyboard, manipulating the data as only an expert could do. After about ten minutes of "ahhhhs"' and "hhhhhms", copying data to several tabs in the workbook, multiple pivot tables, VLOOKUPS, and other formulas, and maybe even an incantation that even Harry Potter couldn't conjure up, he proudly declared that a certain data was missing from the export and that certain expected totals were short. Not exactly what you would call a repeatable test case, but you can't get a show like that just anywhere


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us   


RELATED CONTENT
Data Management Insider
Data governance committee project pays off for Blue Cross
Business intelligence vendors react to SAP's plans to acquire Business Objects
Gartner MDM Summit: Mud wrestling with master data management

Data stewardship
Data quality software, including dashboards for non-IT users, gaining traction
IBM acquires data discovery vendor Exeros
How to estimate customer data cleansing costs
Cleansing data after an acquisition often calls for data quality software
Customer data integration helps credit union personalize customer experience
Exec explains IBM's Information On Demand (IOD) initiative
Business intelligence, data mining maximize direct marketing for Intellidyn
Creating successful data stewardship programs, with Jill Dyché
Gartner's data quality management software rankings show convergence with data integration
Informatica to buy identity resolution software maker for $85 million

RELATED GLOSSARY TERMS
Terms from Whatis.com − the technology online dictionary
data  (SearchDataManagement.com)
data governance  (SearchDataManagement.com)

RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary


.

So how do you manage rogue data? Well this article provides some guidelines that should help your organization "bring in" those outlaw data rustlers, and maybe even win them over to "the Force".

SCOPE
It is important to make some determination of the scope and nature of this type of data in your organization. You'll need to be a little clever here as you go about this. If these outlaws think you're out to take their spreadsheets away, they may become somewhat opaque. (Sssshhh, there could be one in the cube right next to you) You also have to realize that to some extent this will be a "guesstimate", and an effort you periodically repeat.

You can learn a lot about the rogue data in your organization from the people who build reports, administrative staff, people knowledgeable with interfaces, and finally from the business users themselves. You may be able to look at database logs, or other sources that can point to routinely run queries to help identify it.

Just as important as identifying rogue data is "guesstimating" its turnover or life cycle, and frequency of use. You'll usually find a handful of these things corresponding with a large system implementation because of all the out of scope items that the business users determined they would do themselves. So you go about building a list of this rogue data, its source(s), the type of data it is from a business perspective, its frequency of use, when it started being used, and how long it might continue. This just helps you understand the scope of the problem.

APPROACH - GENERAL CONSIDERATIONS FOR ROGUE DATA MANAGEMENT
There isn't a cookie cutter approach to managing rogue data because every organization is different. You should work to develop both a strategic and tactical approach that fits into your over all data architecture. Your goal should be to provide the best solution to manage data and meet business needs, not to globally stop all spreadsheets and desktop databases. A positive approach is a winning approach. If you carve out "Thou shall not use spreadsheets to manage business data" on stone tablets, your success is likely to be limited and short lived.

As you go about implementing your plan for each data set, you'll also need to determine what the best "political approach" is. Some solutions are easy sells, or "win-win" scenarios, while other solutions may challenge someone's domain or turf and may require higher level management involvement. Most importantly you should NEVER even begin to think you will conquer all of the rogue data in the universe or that one solution fits all. In reality you'll be lucky to get it identified, and even luckier to tame a few of the wild rustlers.

STRATEGIC APPROACH TO ROGUE DATA MANAGEMENT
Strategically, you should identify how each set of data fits into your over all data architecture, this can help determine your approach to managing this data.

Some of this data is similar to data key business systems data, but for practical reasons has to be processed outside of these systems. For example, some system can't handle a certain "unique kind" of widget, loan, order, policy, part, security, etc., so these "unique kinds" are maintained in a spreadsheet or desktop database. The number of these "unique kind" of business data is usually small relative to the more "common kind" that can be processed by the business system. Even if the cost of modifying a key business system prohibits maintaining this data within the system it most likely belongs in, this data can usually be easily integrated with its counterpart with good interface design. (By the way, the primary reason key business systems can't handle "unique kinds" of business data is poor abstraction in the database design.)

Some of this data is often what is called "master data" or "reference data" such as lists of organizational hierarchies, operational hierarchies, product catalogs, suppliers, portfolio managers, etc. There are more systems and better quality systems that can now manage this type of data than ever before. This is especially true in the area of business intelligence or informational systems, where master or reference data management is not as difficult as it was in the past. Making this data available to everyone, and integrating it back into source systems can be difficult.

Some of this data is "intermediate" or what I refer to as "special case" data. This is data that requires highly specialized processing or manipulation by a business person with a unique skill set. It might be that this data is processed only quarterly, or only in response to some unique business change, or in a highly technical or specialized manner such that only customized software development would meet this processing need. There simply may not be systems that provide an affordable alternative to processing this type of data.

Strategically it is important that adequate controls be placed around the processing of all data, (rogue or otherwise), so that regulatory, security, disaster recovery, data quality, and other concerns are addressed.

Tactically it's hard to argue to a business group about the use of spreadsheets and desktop databases and other query tools. They're quick, easy, cheap, and as reliable as the users, which in my sample case were people who could perform certain types of higher math better than I could. Some of the reasons not to use spreadsheets and desktop databases to manage key business data include:

TACTICAL APPROACH TO ROGUE DATA MANAGEMENT
I consider tactics as the "how to get it done" part. I feel there are short term and long term solutions because in reality it takes time to execute a strategy. Some of the long term solutions include master or reference data management solutions, improved customer data management and a refined understanding of a what a customer or party is, enhanced requirements analysis and vendor requests, and simply better software development.

Some of these long term solutions are also short term solutions to the extent that they can be implemented sooner. For example, if one of the rogue data sets is such that there is a business application that processes similar data, but not for this "unique kind or type", then tactically you might work with the business and vendor to clearly define what data structure is needed so that the application can process this, and other future "unique kinds" as well.

If you're in a larger organization, you might be able to arrange for a development group that can develop application add-ons to address some of the needs for rogue data. This assures that some consistent and hopefully best practices are put into place. This must be done in a manner that assures the business needs are met in a rapid fashion, otherwise the business users are likely to demand their "application of choice".

Inevitably some of this data will remain managed through spreadsheets and/or desktop databases. For this data, you should define policies or procedures to use these applications to manage this data. This might include such things as:

CONCLUSION
It is important to have a clearly defined strategy, measurable objectives, and commitment in order to manage and reduce the amount of rogue data. It is also important to be a solution provider rather than the "revelator of remorse". By identifying your rogue data and determining, on a case by case basis, the best solution, you can help to "corral" your rogue data in a manner that fosters participation from your business users and creates a win-win situation.

  • For more information on data management, data governance and rogue data, listen to Data management podcast briefing: Data governance trends, with Gewn Thomas.

    Rate this Tip
    To rate tips, you must be a member of SearchDataManagement.com.
    Register now to start rating these tips. Log in if you are already a member.




    DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



Data Management Integration - EDI, EAI, ETL, MDM, CDI, PIM
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 2005 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts