Amazon Web Services dove into the graph technology waters in late November with its Amazon Neptune database. Like almost anything else from today's undisputed cloud computing leader, Neptune immediately gained wide attention. But existing graph database vendors think that attention could be a good thing for them.
In the graph market, AWS will encounter more than a few competitors. Besides large relational database players -- both IBM and Microsoft launched new graph databases based on open source technologies this year -- Amazon joins a group of smaller graph technology purveyors that could use a hand in evangelizing the technology.
Graph databases eschew the traditional column- and row-based relational data format, employing instead a format based on nodes -- or entities -- and edges that represent the relationships between those entities. Fraud detection, master data management and recommendation engines are areas seeing more graph use cases.
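To make the node-and-edge idea concrete, here is a minimal Python sketch of a toy recommendation lookup. Everything in it is an assumption made for illustration: the customer and product names, the `BOUGHT` relationship label and the `recommend` helper are hypothetical, and a real graph database would store and traverse the data through its own engine and query language rather than in-memory dictionaries.

```python
from collections import defaultdict

# Hypothetical toy data: nodes are entities, edges are named relationships.
nodes = {
    "alice": {"type": "customer"},
    "bob": {"type": "customer"},
    "book": {"type": "product"},
    "lamp": {"type": "product"},
}
edges = [
    ("alice", "BOUGHT", "book"),
    ("bob", "BOUGHT", "book"),
    ("bob", "BOUGHT", "lamp"),
]

# Every edge must connect two known nodes.
assert all(src in nodes and dst in nodes for src, _, dst in edges)


def build_index(edge_list):
    """Index edges in both directions so traversal is a dictionary lookup."""
    out_edges = defaultdict(set)  # (source, label) -> set of targets
    in_edges = defaultdict(set)   # (target, label) -> set of sources
    for src, label, dst in edge_list:
        out_edges[(src, label)].add(dst)
        in_edges[(dst, label)].add(src)
    return out_edges, in_edges


def recommend(customer, edge_list):
    """Suggest products bought by customers who share a purchase with `customer`."""
    out_edges, in_edges = build_index(edge_list)
    owned = out_edges[(customer, "BOUGHT")]
    suggestions = set()
    for product in owned:
        # Walk the edge backward to find other buyers of the same product.
        for other in in_edges[(product, "BOUGHT")]:
            if other != customer:
                # Walk forward again to what that buyer owns, minus what we own.
                suggestions |= out_edges[(other, "BOUGHT")] - owned
    return sorted(suggestions)


print(recommend("alice", edges))  # ['lamp']
```

The lookup follows relationships two hops out from the starting customer, which is the kind of traversal that would take repeated self-joins on a purchases table in a relational schema.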
Graph technology is also closely associated with semantic technology, which is intended to uncover meaning in text and other data types. Notably, Neptune can handle both main approaches to graph data, as it supports the property graph model as well as the RDF model.
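The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of how Neptune stores data: the `to_triples` helper and the sample records are hypothetical, and real RDF identifies subjects and predicates with IRIs rather than the short strings used here.

```python
# Property graph style: nodes and edges are records; nodes carry key-value properties.
# The sample data below is hypothetical.
pg_nodes = [
    {"id": "alice", "label": "Customer", "properties": {"city": "Boston"}},
    {"id": "book", "label": "Product", "properties": {}},
]
pg_edges = [
    {"source": "alice", "label": "BOUGHT", "target": "book"},
]


def to_triples(node_list, edge_list):
    """Flatten property-graph records into (subject, predicate, object) triples.

    In the RDF style every fact, whether a type, a property or a
    relationship, is expressed as one triple.
    """
    triples = []
    for node in node_list:
        triples.append((node["id"], "type", node["label"]))
        for key, value in node["properties"].items():
            triples.append((node["id"], key, value))
    for edge in edge_list:
        triples.append((edge["source"], edge["label"], edge["target"]))
    return triples


for triple in to_triples(pg_nodes, pg_edges):
    print(triple)
```

In the property graph form, the city is an attribute attached to the customer node. In the triple form, it becomes one more statement alongside the purchase relationship.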
Welcome to the funhouse
Although they are widely viewed as startups, some of the graph technology players have been in the game for quite some time. Early and late runners alike have seemed to welcome Amazon Neptune to the market, with the occasional, expected caveat.
"This newer technology will see a better-educated audience," said Yu Xu, founder and CEO of TigerGraph, a graph database maker that recently emerged from stealth mode with a system centered on a parallel graph computation engine. Clearly, Xu hopes Amazon's graph entry could speed up enterprise graph database adoption in general by informing more people about it.
"We don't really see that we are competing with them," said Sean Martin, CTO at Cambridge Semantics, which last year added an in-memory massively parallel processing (MPP) graph database engine to its existing semantic technology software line. "We like the fact that their system will generate data in the right form for us to do analytics on it."
Amazon's Neptune move is a clear endorsement of graph concepts, according to Emil Eifrem, CEO of Neo4j Inc., which has been supporting graph database development for over 10 years.
"Now it seems every enterprise software giant has a graph offering," he said. "The only one that has been missing has been Amazon. It is a massive validation."
Eifrem said his company's Neo4j data platform, which added a connector to the Apache Spark parallel analytics engine in a recent release, will continue to emphasize the use of Cypher, a declarative query language that he likens to SQL in its capabilities.
One of the earliest NoSQL database vendors, MarkLogic Corp., formally joined the graph movement in 2013 when it introduced SPARQL querying and RDF support for its MarkLogic Server software. Amazon Neptune's support of SPARQL and RDF will increase awareness of such semantic software, according to Joe Pasqua, executive vice president at MarkLogic. He did suggest, however, that Amazon's initial implementation "may not sit well with enterprise CIOs," who must work with sensitive and regulated data.
Discovering use cases
As unstructured data has proliferated, and as data scientists have worked to discover new relationships within databases, graph approaches have vied for consideration. There are signs of greater graph technology interest in enterprises these days, according to Gartner analyst Nick Heudecker.
"People are discovering use cases that are increasingly being addressed with graph technology," he said. "That includes things like social network analysis and anomaly detection."
However, because of the varied nature of the use cases, discovering graph databases can be something of a journey, Heudecker said.
"It's an interesting challenge to figure out when a graph is a good solution. No one says 'I have a graph problem,'" he said. "But they do find, when they identify that they have, for example, a logistics challenge, that graphs can help."
A chorus of cohorts
While relational databases can perform a type of graph processing, the dedicated graph platforms that have emerged may serve better when users process larger graphs, as is becoming the case in many analytics jobs, Heudecker said. He cited the large cohort groupings found in machine learning data as an apt graph use case.
Graph databases have the potential to play a role in new AI applications, said David Schubmehl, an analyst at IT market research company IDC.
"We see graph databases as part of an effective AI application," he said, noting that what he calls knowledge graphs are now included in IDC's list of AI technologies to watch.
For Amazon, Schubmehl said, the move to graph data is part of a larger objective.
"Amazon is trying to facilitate the performance and the running of machine learning models on their hardware and cloud infrastructure," he said. "Their recent announcements say that they are going to provide the tools and the infrastructure needed for that."
Few underestimate Amazon's potential impact as more data-oriented work moves to the cloud. Indeed, Amazon has had increasing influence in recent years with a diverse range of new cloud databases.
Amazon Neptune joins the Aurora relational database, Athena query engine, DynamoDB NoSQL database, Redshift MPP data warehouse, Simple Storage Service data storage and other elements in the Amazon Web Services portfolio. Amazon also recently added SageMaker machine learning developer tools.
While the existing graph technology players can benefit as Amazon helps shed greater light on graphs, they will need to be nimble, too. The challenge Amazon presents to them is less about Neptune, and more about the Amazon offerings overall. Put another way, it is about general, wide-scale integration versus special, best-of-breed software. For the graph technology players, such challenges are already quite familiar.