Guide to NoSQL databases: How they can help users meet big data needs
Much of the recent attention for in-memory database technology has focused on ERP leader SAP's Hana engine -- but there are some interesting in-memory innovations percolating elsewhere too. Among those is software from Aerospike Inc., which is based in Mountain View, Calif. The Aerospike software couples a key-value NoSQL data architecture with a hybrid (DRAM and flash drive) in-memory scheme for fast operational transactions.
The Aerospike software allows a data analytics house to drive near-real-time (millisecond) marketing decisions based on Web user data. Elad Efraim, chief technology officer and co-founder of New York-based eXelate, said the Aerospike NoSQL database allows the firm to keep its index in DRAM and place the data itself in flash memory on clusters, which reduces both server count and transaction latency. He estimated that the database could support its customers' one trillion real-time data transactions monthly.
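To make the index-in-DRAM, data-on-flash idea concrete, here is a minimal sketch in Python. It is not Aerospike's implementation or API: the class name `HybridKVStore`, its methods and the sample keys are hypothetical, an ordinary dictionary stands in for the DRAM index, and an append-only file stands in for the flash device.

```python
import os
import tempfile


class HybridKVStore:
    """Toy key-value store (hypothetical): index in memory, values in a file."""

    def __init__(self, path):
        # key -> (offset, length); stands in for the DRAM-resident index
        self._index = {}
        # append-only file; stands in for the flash device
        self._file = open(path, "a+b")

    def put(self, key, value):
        data = value.encode("utf-8")
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(data)
        self._file.flush()
        # Only the small location record lives in memory.
        self._index[key] = (offset, len(data))

    def get(self, key):
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        # One in-memory lookup, then one positioned read from storage.
        self._file.seek(offset)
        return self._file.read(length).decode("utf-8")

    def close(self):
        self._file.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        store = HybridKVStore(os.path.join(tmp, "values.log"))
        store.put("user:1", "segment=sports")
        store.put("user:2", "segment=travel")
        store.put("user:1", "segment=autos")  # newer write supersedes the old one
        result = (store.get("user:1"), store.get("user:2"), store.get("user:3"))
        store.close()
        return result


if __name__ == "__main__":
    print(demo())
```

The design point the sketch illustrates is that a read costs one memory lookup plus one storage read, while the bulk of the data stays off DRAM. A production system would also need replication, space reclamation for superseded values and crash recovery, none of which is attempted here.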
Aerospike's Monica Pal, chief marketing officer, said her company's approach to NoSQL acts as an alternative to the combination of data caching and relational databases that has seen great use in distributed systems in recent years. She said the rise of e-commerce and the drive for user personalization are stressing the established caching infrastructures.
For its part, analyst group Forrester Research earlier this month estimated that more than 50% of enterprises will be using in-memory databases by 2017. That projection came via Forrester's TechRadar: Enterprise DBMS report for the first quarter of 2014. The report also said that the ability of today's NoSQL key-value store databases to support millions of users positions them for greater use by "Web scale companies."
EnterpriseDB links to AWS virtual private cloud services
Today, established relational databases are under attack not only from NoSQL upstarts but also from lower-cost open source relational database alternatives. This occurs as database vendors of every ilk push to better position their offerings for the move to cloud computing. [Just this week, IBM moved to improve its position on cloud data, buying NoSQL Data as a Service startup Cloudant.]
Among recent players tuning up for the cloud is Bedford, Mass.-based EnterpriseDB Corp., which provides Postgres database products. A primary goal of Postgres is to undercut established RDBMS pricing. That idea of low cost may become even more compelling to users as they experiment with cloud-based data infrastructures that promise to reduce maintenance and staffing costs.
Last month, EnterpriseDB expanded its Postgres Plus Cloud Database to support Amazon Web Services' Virtual Private Cloud feature. That allows easier switching from on-premises private clouds to Amazon public cloud implementations. Also, EnterpriseDB supports a usage-based model for its cloud database pricing.
"Cloud is a really important shift. It brings infrastructure changes," said Ed Boyajian, CEO at EnterpriseDB. "People are thinking about infrastructure in a different way."
Postgres Plus Cloud Database can support both community PostgreSQL and EDB's Postgres Plus Advanced Server, he said, noting that the software scales automatically based on processing requirements, server capacity or network capacity.
Mega yardstick: NIST to measure big data
Over the years, the National Institute of Standards and Technology has tested or calibrated everything from atomic clocks and butcher scales to electron microscopes and voting machines. So it is not too surprising that the government-backed lab is looking into big data -- one of the hardest-to-measure phenomena to recently arrive. Next week (March 4 to March 6), at its campus in Gaithersburg, Md., the group will stage its first NIST Data Science Symposium to discuss relevant measurement techniques for data science.
A lot of the emphasis will be on the science side of data science, according to Ashit Talukder, chief of NIST's information access division. More research is needed to better understand how to measure, characterize and evaluate the performance and uncertainty of end-to-end systems that combine multiple analytic technologies, heterogeneous data modalities, streaming sensor data analytics, and workflows and systems where humans are part of the analytic system, he said via email.