Graph databases have emerged as yet another way to connect data points -- but graph processing requirements for...
big data have sometimes kept them out of the reach of real-time, operational analytics.
Led by former Teradata Hadoop engineer Yu Xu, TigerGraph came out of stealth in 2017, claiming such notable users as Visa and the online payment platform Alipay for its graph database technology. The software has been used by these players, for example, to speed up credit checks and other traditionally time-consuming financial processes.
According to TigerGraph, its database supports a Massively Parallel Processing architecture in which graph nodes -- the company uses the less common term "vertices" -- exhibit both compute and storage features. The vendor also employs a parallel loader to speed data ingestion and has fashioned a GSQL analytics language to produce parallel graph queries.
IceKredit Inc. has found those features useful in its efforts to expand the availability of credit ratings and risk assessments, according to Minqi Xie, vice president and director of modeling and business intelligence at the financial technology company, which was founded in Shanghai and has a U.S. office in Westlake Village, Calif.
"We have very large data sets with hundreds of millions of [nodes], and we need to mine the relationships at depth," said Xie, who works to ensure that IceKredit provides useful online credit ratings and risk monitoring services for companies and individuals.
Xie said IceKredit uses graph analytics to uncover connections within data sets that can identify patterns of risk. Such patterns must be uncovered more and more quickly in the fast-moving fintech space, where credit approvals that once took a week are now accomplished in minutes.
"TigerGraph made it feasible for us to leverage the features from relationship networks for real-time scoring," he said.
Graph processing cuts through data molasses
Like others, Gaurav Deshpande, a longtime IBM data hand who earlier this year became vice president of marketing at TigerGraph, points to graph processing as superior to relational database schemes when it comes to representing interconnected relationships between data points.
Philip Howardanalyst, Bloor Research
But how quickly these connections can be analyzed can be an obstacle, Deshpande claimed.
Traditionally, graph databases "weren't 'operational' when it came to data volumes beyond 50 to 100 GBs," he said. "Performance slowed down. When you got to more than 500 GBs, performance was like molasses."
Such systems were useful for visualizing data, but "they were nothing that could be used operationally," Deshpande said. The parallelizing techniques that TigerGraph applies to its graph database, he indicated, are intended to bring graph processing technology closer to operational efficiency with large data sets.
Complex analytics at scale
TigerGraph is looking to join the still fairly limited ranks of graph database makers that employ a number of diverse approaches to implementing the NoSQL technology. These include vendors such as Neo4j, Franz and Cambridge Semantics. Among others are AWS and Microsoft, which have taken steps to bring graph processing technology more mainstream with, respectively, Amazon Neptune and Azure Cosmos DB cloud offerings.
"Historically, almost all the graph database vendors have focused on operational capabilities. They support a certain level of analytics and query processing but not complex analytics at scale," said Philip Howard, an analyst at Bloor Research.
"TigerGraph, on the other hand, is specifically targeted at precisely those sorts of use cases, so you can't really compare it with most of the other suppliers," he said.
There are some precedents, however, and signs of change. Bloor noted that supercomputer maker Cray offers graph analytics capabilities as part of its product portfolio. And he sees others ready to follow suit.