Diana Hwang, a content development strategist at TechTarget, joined SearchBusinessAnalytics and SearchDataManagement editors Ed Burns and Jack Vaughan in this edition of the Talking Data podcast to discuss participation in a data journalism workshop sponsored by the New England Science Writers group. As part of an analytics project during the workshop, Hwang and Vaughan used software tools to pursue a deeper understanding of the data underlying the ongoing California drought.
While they didn't come away from the workshop with a drought story, they did leave with a greater respect for the amount of work that goes into preparing data for analysis. In the podcast, Hwang described their use of Excel, Python, IPython and the pandas analytics library, as well as the GeoJSON data format and CartoDB, an online service that makes geographic data mapping more easily accessible to non-technical end users.
"We had to go to several state and [U.S.] government agencies to find the correct data. We layered U.S. Census data upon that as well," she said. "That took a bit of time."
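To give a sense of what that layering step can involve, here is a minimal Python sketch. It is not the code Hwang and Vaughan used: it relies only on the standard library rather than pandas, and the field names (`county_fips`, `drought_level`, `population`) and every value in it are hypothetical placeholders, not real agency or Census figures. It joins population onto drought records by a shared county code, then writes the result as GeoJSON of the sort a mapping service can import.

```python
import json

# Hypothetical placeholder rows; NOT real drought or Census figures.
drought_rows = [
    {"county_fips": "00001", "name": "Example County A",
     "lon": -120.0, "lat": 37.0, "drought_level": 3},
    {"county_fips": "00002", "name": "Example County B",
     "lon": -118.5, "lat": 34.5, "drought_level": 4},
]
census_rows = [
    {"county_fips": "00001", "population": 1000},
    {"county_fips": "00003", "population": 2500},
]


def layer_population(drought, census):
    """Left-join census population onto drought rows by county code."""
    population_by_fips = {row["county_fips"]: row["population"] for row in census}
    merged = []
    for row in drought:
        combined = dict(row)
        # Counties missing from the census table get None, which makes
        # gaps between the two sources easy to spot.
        combined["population"] = population_by_fips.get(row["county_fips"])
        merged.append(combined)
    return merged


def to_geojson(rows):
    """Build a GeoJSON FeatureCollection of points from the merged rows."""
    features = []
    for row in rows:
        properties = {k: v for k, v in row.items() if k not in ("lon", "lat")}
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [row["lon"], row["lat"]]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


merged = layer_population(drought_rows, census_rows)
geojson_text = json.dumps(to_geojson(merged), indent=2)
```

Even in this toy version, one of the two drought records has no match in the population table. Mismatches of that kind are a large part of why data preparation takes time.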
Burns related his own recent experiences studying data analysis and the use of the open source R programming language for analytics. Some simple filters can be produced without too much effort, he said -- but, not surprisingly, creating more complex algorithms was a hurdle for the new R student.
Data journalism clearly has roots in traditional journalism. Facts, some painstakingly dug up by intrepid researchers, have always been at the heart of the best efforts of reporters. But the advent of open databases, many available on government websites, could betoken a new age of data-driven reportage.
In the podcast, Vaughan noted the great variety of data formats -- geographically oriented ones alone representing a rich treasure trove of information. But, he noted, that very richness can be somewhat daunting, as it may require the generalist reporter to become more and more of a data domain expert. The same kind of dynamic is playing out inside other types of organizations, where demand is growing for data scientists, data engineers and other workers with specialized skills who can help pull business value out of growing corporate data vaults.
Listen to the podcast to hear more about the potential -- and challenges -- of data analytics in general and data journalism in particular.