Big data applications: Real-world strategies for managing big data
When adMarketplace lost its eBay backing, the search syndication and advertising platform provider turned to "big data" to fill the sizable void and, after a tumultuous period of rebuilding, eventually managed to extend its reach to the Internet at large.
AdMarketplace provides search syndication -- the routing of potential customers to relevant ads based on site searches -- to a healthy stock of advertisers. Its syndication offerings, supported by the company's robust system of big data machines, have landed it some business giants for clients, including Sony, Walmart and GM. Its big data tactics have also secured the company's place as a semifinalist in this year's Constellation SuperNova awards, which honor organizations that take innovative approaches to meeting customer needs.
The search advertising company processes billions of ad requests daily in near real-time using a traditional pay-per-click platform with a twist. Advertisers bid on keywords and pay each time a user clicks through. But unlike major competitors, it also takes the source of traffic into account, providing custom pricing based on the relevancy of the clicked-from site and the consequent likelihood of profit from the ad view.
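To make the pricing idea concrete, here is a minimal sketch of source-adjusted click pricing. It is an illustration only, not adMarketplace's actual formula: the `TrafficSource` class, the 0-to-1 `relevancy` score and the simple multiplication are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficSource:
    """A site that sends clicks. Both fields are hypothetical."""
    name: str
    relevancy: float  # assumed score between 0.0 (irrelevant) and 1.0 (ideal)


def adjusted_click_price(bid: float, source: TrafficSource) -> float:
    """Scale an advertiser's keyword bid by the relevancy of the traffic source.

    Assumption: price is bid * relevancy. A real platform would likely use
    a richer model built from observed performance.
    """
    if bid < 0:
        raise ValueError("bid must be non-negative")
    if not 0.0 <= source.relevancy <= 1.0:
        raise ValueError("relevancy must be between 0.0 and 1.0")
    return round(bid * source.relevancy, 4)


# Usage: the same keyword bid costs less per click on a weaker source.
strong_site = TrafficSource(name="strong-site", relevancy=1.0)
weak_site = TrafficSource(name="weak-site", relevancy=0.25)
price_on_strong = adjusted_click_price(2.00, strong_site)
price_on_weak = adjusted_click_price(2.00, weak_site)
```

Under these assumptions, an advertiser bidding on one keyword pays the full bid for clicks from the most relevant sites and proportionally less for clicks from sites that are less likely to convert.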
Fall from grace leads to new big data strategy
AdMarketplace launched in 2000, and its approach to search advertising soon caught eBay's attention; in 2003, it was chosen as the exclusive pay-per-click management platform for the online auctioneer.
Michael Yudin, CTO, adMarketplace
The partnership came to an abrupt end three years later, however, when eBay made a deal with Yahoo that effectively terminated adMarketplace's exclusivity. That's around the time when adMarketplace's current chief technology officer, Michael Yudin, stepped up to the plate.
"It was probably the hardest year in my life when I started here," Yudin said. "We lost a major source of revenue. We had financial problems. We had to build this new platform while staying in business."
And build they did. The company dug its way out of negative net worth a year after the separation, and it did so without outside help.
"It was completely bootstrap growth all the way," said Yudin, who now considers the break from eBay "a big blessing in disguise." The split ultimately inspired the company to construct an independent syndication platform, and on a much vaster scale.
"EBay is good, and it's big, but the Internet is much, much bigger," Yudin said.
As the company began reconstruction, it formed an impressive network of publishers to push customer reach past mere search engine result pages. Whereas Yudin sees industry frontrunners Google and Yahoo treating their syndicated networks as "a one-size-fits-all extension of their own site search activity," adMarketplace uses a range of metrics to provide unique insights to advertisers.
"Our platform has been specifically built for managing this complexity that these companies have not solved," Yudin said. "Our system gives you data, gives you controls [and] differentiated pricing based on performance of your ads across all these sources."
Dealing with the big data onslaught
With all this customization, adMarketplace processes a lot of data -- 100 gigabytes an hour, to be exact.
"Big data is really the life blood of our system," Yudin said. "It touches every component of our platform."
But managing all this information is no simple feat. After the split from eBay, adMarketplace found that it needed another database to deal with the growing volume of information and to supplement its in-house systems, including an HP Vertica data warehouse appliance.
Latency and uptime considerations were of utmost importance when it came to selecting the right database vendor.
"If you don't return ads fast enough, you might as well not even be in this business, because no one's going to wait for you," Yudin said.
Yudin and his team began the search by looking at NoSQL database providers, weeding through the open-source market and unearthing two standouts: MongoDB and Citrusleaf, a company founded in 2009 that has since changed its name to Aerospike.
While MongoDB's software itself was free, the product was weighed down with management and support costs. Ultimately, the "well-known" and "good" company just wasn't capable of achieving the kind of scale adMarketplace was looking for, Yudin explained.
That left Aerospike, but the company had to woo adMarketplace for over a year before it agreed to an implementation.
"I kept getting sales calls from them every week," said Yudin, who first met with Aerospike's CEO and CTO soon after the company's initial launch as Citrusleaf.
The wait gave Aerospike time to mature, according to Yudin, who described the company as still "very new" and "cutting edge." The database has been in production for six months now, and things are running smoothly, though Yudin thinks the company has room to grow.
Aerospike's speed may come at the expense of functionality, which Yudin described as "very simple." Aggregated queries, such as finding the total number of customers interested in a specific keyword over the course of a day, are not possible. "If they can add this while continuing to be fast, reliable and scalable, that would be good," Yudin said.
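The tradeoff Yudin describes can be sketched in a few lines. The example below uses a plain Python dictionary as a stand-in for a key-value store; it does not use Aerospike's API, and the record layout, key names and dates are hypothetical. It contrasts a single-key lookup, which a key-value store answers directly, with the keyword-per-day count from the example above, which has to visit every record when the store offers no aggregation.

```python
from datetime import date

# Hypothetical records keyed by user id. The dict stands in for a key-value store.
store = {
    "user:1": {"keyword": "cameras", "day": date(2012, 1, 1)},
    "user:2": {"keyword": "cameras", "day": date(2012, 1, 1)},
    "user:3": {"keyword": "laptops", "day": date(2012, 1, 1)},
    "user:4": {"keyword": "cameras", "day": date(2012, 1, 2)},
}


def get_record(kv, key):
    """Single-key lookup: the operation a key-value store is built to do fast."""
    return kv.get(key)


def count_interested(kv, keyword, day):
    """Aggregate query done by the caller: scan every record and count matches."""
    return sum(
        1
        for record in kv.values()
        if record["keyword"] == keyword and record["day"] == day
    )


one_user = get_record(store, "user:3")
camera_shoppers = count_interested(store, "cameras", date(2012, 1, 1))
```

The full scan is cheap on four records but not on billions, which is one reason to hand that kind of question to a SQL warehouse instead.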
The Aerospike database holds the data that must be accessible in real time, while a SQL-based system behind it handles more complex queries. Yudin swears by adMarketplace's combination SQL/NoSQL strategy. Despite deeming traditional SQL databases too slow for real-time data processing, Yudin believes they have a fundamental place alongside NoSQL in an effective big data management strategy.
"You need to have both NoSQL and SQL components of your data engine well-tuned," Yudin said. "I don't think that you'll really be able to have a well-functioning product [without] both of them."
AdMarketplace's platform comprises hundreds of servers in several data centers. The servers work to process numerous events, or data points, into log files, Yudin explained. Then, a distributed engine created in-house aggregates the information, and dozens of terabytes of meaningful data are stored in the SQL-based warehouse. This ultimately allows advertisers to see pertinent information, such as where ads were shown, how many clicks they garnered and the total cost for the service.
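The log-to-warehouse flow described above can be illustrated with a small, single-machine sketch. The company's in-house engine is distributed and its log format is not public, so the CSV layout, the column names (`ad_id`, `source`, `event`, `cost`) and the function name here are all assumptions.

```python
import csv
from collections import defaultdict

# Hypothetical raw event log: one row per impression or click.
SAMPLE_LOG = """ad_id,source,event,cost
ad-1,site-a,impression,0
ad-1,site-a,click,0.25
ad-1,site-a,click,0.25
ad-1,site-b,impression,0
"""


def aggregate_events(log_lines):
    """Roll raw event rows up into totals per (ad, traffic source).

    Returns a dict mapping (ad_id, source) to impressions, clicks and cost,
    the kind of summary row a reporting warehouse could store.
    """
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0, "cost": 0.0})
    for row in csv.DictReader(log_lines):
        summary = totals[(row["ad_id"], row["source"])]
        if row["event"] == "impression":
            summary["impressions"] += 1
        elif row["event"] == "click":
            summary["clicks"] += 1
            summary["cost"] += float(row["cost"])
    return dict(totals)


report = aggregate_events(SAMPLE_LOG.splitlines())
```

Each entry in `report` answers the questions the article lists for advertisers: where an ad was shown, how many clicks it drew and what those clicks cost in total.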
Big launch for big data
AdMarketplace continues to make new strides in harnessing its massive data store with the recent release of a new product, Advertiser 3D.
"It's the only ad platform that gives the user access to thoroughly big data in a very elegant and efficient way," Yudin said.
Features include the ability to see "enormously detailed data" related to the number of clicks an ad receives, the cost of clicks and cost per action. This enables advertisers to place informed bids on keywords to bolster traffic. The platform also includes apps, powered by the company's algorithmic engine, that suggest ways to improve performance and increase traffic.
Despite this array of complex data-wrangling technology, adMarketplace's goal remains simple. "We're in the business of helping good companies make more money, acquire more customers and grow their businesses online," Yudin said.
For other organizations hoping to build or rebuild, Yudin has some hard-earned advice. After seeing many businesses throw time and money into perfecting products that they haven't tested in the marketplace, he's learned "the world does not work this way." Instead, Yudin offers a more daring approach.
"You throw something out there, you see what works, [then] you improve one customer at a time," he said. "Don't be afraid to fall. [Stay] in the game, and that's how you're going to get better."