Today, high-level decision making can often start with machine-generated data in a lowly log file. As the Web has expanded, these quickly accumulating log files have become one of the most vital of the unstructured data types coursing through the enterprise. In the hands of different IT professionals, they can kick off a wide variety of data-intensive efforts.
The log files provide important clues about both machine performance and user experience. When correlated with other data, they might deliver insight into operations of all kinds: a streaming service monitors connectivity experiences, a medical device company feeds executive dashboards, or a radio network links audience activity to promotional campaign results.
People are finding more uses for log file data. As a result, organizations are paying more attention to software directly aimed at collecting and indexing this machine-generated data, which has been called "the original 'big data.'"
While big vendors such as IBM, SAS Institute Inc. and Informatica Corp. offer tools for gathering log data, smaller vendors such as Loggly Inc., Sumo Logic Inc. and Splunk Inc. are seen as particularly focused on the need for log data apps. With a recent $229.5 million initial public offering on Wall Street, Splunk has gained attention far beyond its original audience of data center administrators. At the same time, customers are finding varied uses for the cloud software that Splunk describes as an "operational intelligence platform."
It is clear that this software is moving far beyond its roots in IT systems administration and management. However, the admin route is still how it often enters the organization.
That is the case with Cricket Communications. Like other telecommunications providers, its business works better if it can reduce customer churn, quickly addressing -- or even forecasting -- issues that disrupt the user's experience with the network. Connecting operations' log activities to decision makers is important.
"The IT group was an early adopter of Splunk," said Ty Prinkki, senior operation manager at Cricket. After that initial use, Prinkki and his colleagues decided to cull the log data more carefully to begin to understand user sessions related to music downloads, a new service the company had begun to provide.
Some successful use of log activity can lead you to collect even more log data, he told an audience earlier this year at Gartner's ITxpo event in Orlando, Fla. "We started with 200 GB per day; now we are at 2 TB per day," he said.
He said using logs has led the team to take a new approach to logging in general. "In the future, we will pay new attention to what should be in there on the logging side," said Prinkki.
At Cricket, the log data collection efforts helped the company prioritize its software rollouts. "As we gained insight into what was in our logs, we were able to get insight into the devices themselves," he said. This kind of data could be fed back to customer support, allowing staff to look at representations of specific sessions where customers had problems.
In some cases, log data application software may be used in ways that resemble complex event processing. Splunk has been used to intelligently instrument sensor networks in medical systems, to find customer preferences on a Web-based travel site and to detect fraud in banking.
Nomura Securities is another Splunk shop that started with log analysis but then expanded its uses, in this case to take part in what could be described as a high-powered form of business activity monitoring (BAM).
Nomura's world is a trading world where latency costs money, said Brian Ross, vice president of IT strategy and architecture. Splunk is used there as part of a system to create alerts on application errors and do root cause analysis, as well as to correlate events to system utilities and to a corporate asset database. "We did not buy Splunk with this in mind at all," Ross said. Instead, the group discovered "a litany of use cases."
New possibilities for log data apps continue to appear. Some people see log data application software as an alternative to Apache Hadoop tools for big data applications. Others see complementary roles for log data application software and Hadoop analytical infrastructure software.
For its part, Splunk recently released Splunk Hadoop Connect for real-time collection, indexing and analysis of data. Once obtained, that data can then be forwarded to Hadoop for archiving and additional batch analytics.
This was first published in December 2012