PRO+ Premium Content/Business Information

June 2017, Vol. 5, No. 3

Data lake management, governance a hands-on job for big data teams

Data lakes evoke images of vast pools of raw data, available for unfettered exploration and analysis. But the reality isn't so free and easy: To avoid information chaos, all that data needs to be cataloged and governed -- and doing so is still a developing and very often do-it-yourself process. As a result, data lake management and governance frameworks are in the formative stage in many organizations, with IT and data management teams scrambling to piece together combinations of governance tools and mechanisms to help keep their big data environments in order.

That's the case at medical insurer Health Care Service Corp. (HCSC), which deployed a Hadoop data lake in April 2016 to give its data scientists and other analysts self-service capabilities for analyzing data from source systems across the Chicago-based company's operations. But self-service doesn't mean a free-for-all in the Hadoop cluster, explained Susan Swanson, senior manager of data modeling and architecture at HCSC. "We need something that's governed and controlled...
