Western Union connects the dots to enable enterprise Hadoop use

Building and running enterprise Hadoop applications takes more than data crunching. First, Hadoop data must be absorbed into company processes, a Western Union IT manager says.

Open source Hadoop and supporting data management tools created by Web giants like Google, Yahoo and Facebook give more traditional companies new ways to tap into mounds of data from the Internet and other sources. But in most cases, deploying an enterprise Hadoop system takes more than just massive data-crunching capabilities.

At money-transfer and payment services company Western Union, for example, incorporating Hadoop into its enterprise operations also meant successfully integrating large amounts of unstructured Web information into the corporate workflow.

"We can do things we couldn't do in the past because the data was so huge and getting answers was very difficult. Hadoop is helping," said Pravin Darbare, senior manager of data integration and data discovery at Western Union. But considerable work was required to make new forms of data readily usable by the Englewood, Colo., company's data scientists, business analysts and marketing workers, he said.

That included building out self-service business intelligence and analytics applications for the end users. And to feed those applications with the needed data, Darbare said, Western Union had to integrate a Hadoop cluster with a variety of other systems and software pieces.

Enterprise Hadoop integration snapshot

Customers can use Western Union's website to send money to other people, pay bills and buy or reload prepaid debit cards, or find the locations of agents for in-person services. Log-based clickstream data about user activity on the site is captured in unstructured non-relational formats, and then it gets mixed with relational data and moved along a data analysis pipeline, said Darbare, who called these data feeds funnels.
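As a rough illustration of the kind of funnel Darbare describes, the sketch below parses raw clickstream log lines and enriches each one with relational customer attributes. It is only a minimal sketch: the log layout, the field names and the in-memory `customers` lookup are hypothetical, and it does not represent Western Union's actual pipeline or any Informatica API.

```python
import re
from typing import Dict, Iterable, Iterator, Optional

# Hypothetical log layout (an assumption, not from the article):
#   "<ISO timestamp> <user_id> <HTTP method> <path>"
LOG_PATTERN = re.compile(
    r"^(?P<ts>\S+) (?P<user_id>\d+) (?P<method>[A-Z]+) (?P<path>\S+)$"
)


def parse_click(line: str) -> Optional[Dict[str, str]]:
    """Turn one raw log line into a record, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def funnel(
    lines: Iterable[str], customers: Dict[str, Dict[str, str]]
) -> Iterator[Dict[str, str]]:
    """Parse clickstream lines and merge in relational customer attributes."""
    for line in lines:
        click = parse_click(line)
        if click is None:
            continue  # skip lines that do not fit the assumed layout
        # Stand-in for a join against a relational customer table.
        profile = customers.get(click["user_id"], {})
        yield {**click, **profile}


if __name__ == "__main__":
    sample_lines = [
        "2013-06-01T12:00:00Z 42 GET /send-money",
        "malformed line",
        "2013-06-01T12:00:05Z 7 POST /pay-bill",
    ]
    sample_customers = {"42": {"country": "US"}}
    for row in funnel(sample_lines, sample_customers):
        print(row)
```

In a real deployment the parsing and joining would run inside the integration and Hadoop layers rather than in a single Python process; the point here is only the shape of the step, in which semi-structured log records are normalized and then keyed against relational data.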

The data load going through such funnels is heavy: In 2013, more than 70 million users tapped into Western Union services online, in person or by phone, and the company said it averaged 29 transactions per second across 200 countries and territories.
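To put the stated average in perspective, a back-of-the-envelope calculation can convert 29 transactions per second into daily and yearly totals. This assumes the average holds around the clock for a 365-day year, which is a simplification; the results are implied by the article's figure, not reported by the company.

```python
# Scale the article's stated average rate up to daily and yearly volumes.
# Assumption: the 29-per-second average applies uniformly, 24 hours a day.
TRANSACTIONS_PER_SECOND = 29
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400
DAYS_PER_YEAR = 365

per_day = TRANSACTIONS_PER_SECOND * SECONDS_PER_DAY
per_year = per_day * DAYS_PER_YEAR

print(f"Implied transactions per day:  {per_day:,}")   # 2,505,600
print(f"Implied transactions per year: {per_year:,}")  # 914,544,000
```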

And the load is likely to grow quickly as use of the Web and mobile devices continues to increase. The enterprise Hadoop system is helping to handle the data influx, and Western Union is relying on data integration software from Informatica Corp. to tie the cluster into its broader analytics architecture.

First, the streams of unstructured and relational data are parsed and prepared for analysis using Informatica's Big Data Edition and Data Replication tools. The integration platform in turn connects with an IBM Netezza engine for structured data analytics and with the cluster, which is based on Cloudera Inc.'s Hadoop distribution, for storage and processing of both structured and unstructured data. This is all connected to Tibco Software Inc.'s ActiveSpaces in-memory data grid, and to analytics tools from SAS Institute Inc. and Tableau Software.

Forging a fast path to Hadoop

Using a commercial Hadoop distribution rather than basic Apache Hadoop is important to Western Union's effort. "We need support for the software," Darbare said, recalling the advent of the open source Linux operating system in the late 1990s. "This is just like many years back. We went through this with Linux -- it's the same scenario."

In addition, he said, Informatica's data integration tools helped Western Union get useful Hadoop applications up and running quickly. For example, Western Union's marketing teams are able to use the newly culled data to study website activity and to refashion the site in an effort to create a better user experience.

Darbare was more guarded in discussing uses of the enterprise Hadoop system by the data scientists at Western Union. However, there are clear indications that risk analysis and compliance with financial regulations are major drivers for advanced technology use at the company, which needs to guard against money laundering and other financial crimes.

While applications could be built quickly, Darbare said it took a year of effort to get to the point where his team could say the Hadoop data was being widely used. Now, though, the information in the cluster is crucial to the analytics process.

"Today, I don't think we can live without Hadoop," he said. "What we hear now is people that say, 'If Hadoop is down, I can't do my work.' This kind of comment is good for us. People are depending on the data."

Jack Vaughan is SearchDataManagement's news and site editor. Follow us on Twitter: @sDataManagement.

What has your experience building and running enterprise Hadoop applications been like?
My experience with Hadoop has revealed that it can become an alternative to enterprise data warehouses. Hadoop is a highly scalable platform that allows me to store and distribute humongous data sets across numerous servers operating in parallel. It has been providing me with a cost-effective storage solution. However, Hadoop applications take more than data crunching; they essentially require improved data governance. Hadoop data must be absorbed into the company processes.
Western Union's tremendous data volume has yielded a lot of insight for business decisions. With Hadoop in the picture, there have been many improvements in performance and in the way the data is consumed. Truly amazing what Hadoop can do, and kudos to the WU team.
Great insight into the strategy and execution details of Big Data (structured and unstructured data relationships) and the clustered Hadoop framework. You are trailblazers in leveraging the technology for the rigor of the financial industry. Way to go BIG!!!
Big data is a big dictum in the IT world. It’s amazing to see how Western Union has incorporated Hadoop to integrate humongous unstructured data into meaningful data feeds for analytics applications and BI.
There has been huge hype about big data. It’s encouraging to see such a success story.

Truly splendid!
Excellent insight on Hadoop's increasing presence in the corporate world, and I share the vision Mr. Darbare talked about: Hadoop is going to touch our lives in a big way. Considering the huge data volume at Western Union, I can easily compare it to my company's structured/unstructured data model and think of the impact it can make. Who knows data issues better than the person who has spent years extracting meaningful information from it? It's going to be great news for all of them. Wonderful article.
If it requires as much processing power as Acrobat or Flash, it's going to require a huge hardware investment across the board for Western Union.
The baby elephant is growing, getting bigger and bigger @WU. When we say Hadoop ecosystem, it's no longer restricted to typical Hadoop components on the big data platform; rather, the success lies in the efficient integration of various tools and technologies with Hadoop to ensure smooth, flawless availability of sensible data for all sorts of analytics supporting business needs. This is the core area where WU has really done a wonderful job. Kudos, team WU!!