Big data management
- February 25, 2020
Databricks Ingest entered a public preview in a move by Databricks to enable a lakehouse that combines the best features of the data lake and data warehouse models.
- January 14, 2020
Cloudera finds a new leader, pulling the former CEO of Hortonworks back into the fold to help set the direction for the big data Hadoop vendor as it moves forward in 2020.
- December 18, 2019
Cloud native data management isn't usually the first thing on anyone's mind, acknowledges Portworx co-founder and CTO Goutham Rao -- but find out how it's become part of organizations' journey.
- October 18, 2019
Databricks has found a new home at the Linux Foundation for its open source Delta Lake data lake project, in a bid to help grow a broader community and accelerate adoption.
- September 25, 2019
Cloudera released a big data platform combining its technologies and ones from Hortonworks, initially in the AWS cloud but with multi-cloud support to come.
- August 20, 2019
Self-service data preparation can duplicate work and slow down analytics. One possible fix: an internal marketplace where users can 'shop' for data assets.
- August 06, 2019
Longtime independent big data vendor MapR goes out of business, selling technology and intellectual property to HPE. The move marks the continuing decline of the Hadoop market.
- July 30, 2019
Hitachi Vantara's new Pentaho update brings DataOps capabilities for data management to help organizations derive better data insights.
- July 26, 2019
Inspired by the IBM-Red Hat model, Cloudera goes the open source route to broaden its market as demand for Hadoop weakens and the vendor takes on big competitors like AWS.
- June 28, 2019
EnterpriseDB is looking to push its database further with help from new financial backers. The deal sees Postgres originator Michael Stonebraker coming on board as technical adviser.
- June 18, 2019
MongoDB released an S3-compatible data lake its developer legions can quickly query. But word of MongoDB Atlas use on Google's cloud shows there are clouds to sow beyond AWS.
- May 31, 2019
It's right there in a MapR letter to California's labor department: A leader in the Hadoop market is desperately seeking funding after poor sales of its promising data platform.
- May 07, 2019
Blockchain is an intriguing technology, but it carries high system overhead. ProvenDB adds blockchain to MongoDB in an effort to gain acceptable performance.
- April 30, 2019
In this Q&A, now-former Snowflake CEO Bob Muglia discusses the vendor's decision to embrace cloud data warehousing and how the industry is changing as more data moves to the cloud.
- April 29, 2019
Teams at Wayfair mix new open source tools to power customer-facing apps. In such shops, tech leaders like Ben Clark must deftly maneuver an obstacle course of data components.
- April 17, 2019
New Google Cloud boss Thomas Kurian is putting databases and data management at the forefront at Google. The vendor has forged key data deals, showing a more mature Google Cloud.
- April 15, 2019
Events are as important as data in emerging applications underlying many e-commerce efforts. Streams of events tell a company what motivates customers to use online products.
- April 04, 2019
Tools such as Unravel and Pepperdata offer a way to measure performance of big data cloud applications, which may aid companies with on-premises configuration issues.
- March 25, 2019
Startups Interana and Rockset differ in their approaches to providing new query capabilities on fast-arriving big data. Both are led by technologists who started at Facebook.
- March 12, 2019
Apache Kafka and Apache Spark connectors ease use of the Aerospike NoSQL data store in high-speed applications such as analytics that are becoming more broadly supported.
- March 11, 2019
Data catalogs form a hub for managing enterprise data. New products focus on machine learning and AI add-ons that help automate aspects of data governance.
- February 14, 2019
The Presto engine arose as an alternative to Hive for big data queries. Now, the Presto Software Foundation has formed to promote the SQL query software's virtues.
- February 01, 2019
Cloud giants like AWS have adopted open source databases, causing Confluent, MongoDB and others to guard their assets the best way they know how: licensing.
- January 15, 2019
Two wunderkinds of Hadoop have formalized their merger. Cloudera and Hortonworks say they will place special focus on AI as they chart the stand-alone vendor's future.
- December 28, 2018
Better data governance, increased cloud use and wider DataOps adoption head the list of trends for data management teams to plan for in 2019, IT analysts say.
- December 14, 2018
Third-party vendors that offer data platforms to AWS users tout hedges against cloud lock-in. But they must both compete and collaborate with the cloud leader.
- October 12, 2018
MarkLogic rolled out a cloud-service version of its NoSQL database management system, a move designed to make the technology more cost-effective for cloud users.
- October 04, 2018
Hadoop users will have fewer choices as big data rivals Cloudera and Hortonworks unite. But the new company may be more competitive with AWS and Google.
- September 13, 2018
Hortonworks is joining with Red Hat and IBM to work together on a hybrid big data architecture format that will run using containers both in the cloud and on premises.
- August 08, 2018
Confluent Platform updates seek to bring data streaming with Apache Kafka to a wider audience. A new GUI and user-defined functions are part of the 5.0 release.
- July 19, 2018
Chief data officers and experts see the CDO role as changing to a more strategic orientation -- especially finding key opportunities in vast troves of data.
- July 16, 2018
The chief data officer role is about many things -- regulations, innovation, AI and more. Consultant Randy Bean discussed the matter ahead of an MIT symposium on the topic.
- June 27, 2018
NoSQL vendor MongoDB upgraded its database software with ACID support, while also releasing a serverless platform intended to simplify application development.
- June 25, 2018
Hortonworks users talk about building Hadoop data lakes to support new applications -- and the challenges their teams face on ingesting and refining data for end users.
- June 18, 2018
Hortonworks now supports Google Cloud Storage and has also broadened cloud deals with Microsoft and IBM, aiming to increase cloud uses of its big data platform.
- March 21, 2018
Big data vendors and users are looking to Kubernetes-managed containers to help accelerate deployments and enable more flexible use of computing resources.
- February 16, 2018
MongoDB is taking a deeper step into SQL-style processing waters with a 4.0 update that brings increased support for ACID-compliant transactions to its NoSQL database.
- January 03, 2018
Is this the post-Hadoop era? Not in the eyes of Hadoop 3.0 backers, who see the latest update to the big data framework succeeding in machine learning applications and cloud systems.
- October 05, 2017
At the Strata conference in New York, IT managers detailed steps they're taking to improve data quality in their big data environments in order to help ensure analytics accuracy.
- September 28, 2017
The Strata conference in New York saw big data platform vendor MapR Technologies update MapR-DB, its NoSQL database engine, to better perform in real-time analytics applications.
- August 30, 2017
In this Talking Data podcast, TechTarget editors discuss Hadoop's future, IBM's decision to resell the Hortonworks distribution of the open source technology and other big data issues.
- July 31, 2017
Data management startup Dremio has aimed its Apache Arrow expertise at the problem of self-service data delivery. In-column caches and optimization speed queries across varied data stores.
- July 24, 2017
In many organizations, chief data officer jobs centered on defense against risk are giving way to ones emphasizing innovation. To make that shift, CDOs must nurture a data culture, MIT panelists said.
- July 10, 2017
MongoDB targets better dashboard visualization with MongoDB Charts, which gives business users another means of looking into their NoSQL data pools.
- June 23, 2017
MongoDB has expanded cloud coverage for its Atlas hosted database service, with Azure and Google versions joining an initial AWS-based offering to give users a choice on cloud platforms.
- June 20, 2017
IBM pulled the plug on its distribution of Hadoop in favor of reselling Hortonworks' bundle of big data technologies, a decision that reduces the number of Hadoop vendors to four.
- May 31, 2017
Deep learning applications often require a mix of data and assorted preprocessing techniques. That makes data preparation a priority, and conventional machine learning may have a role to play.
- May 12, 2017
Kafka is a linchpin in many on-premises big data pipelines. Now, software vendor Confluent is offering a Kafka cloud service to ease use of the messaging and data streaming system in the cloud.
- April 28, 2017
Data lakes offer a more expansive alternative to data warehouses for analytics uses. TDWI analyst Philip Russom offers advice on how to get things right in a data lake architecture.
- April 28, 2017
Systems of engagement represent a hotbed of activity in data management these days. Flexibility and scalability are watchwords.
- April 20, 2017
Corporate users are becoming more open to deploying big data systems with Apache Spark in the cloud, Databricks CEO Ali Ghodsi says in a Q&A on the open source processing platform.
- April 14, 2017
Software containers encapsulate complexity and ease deployment, two traits that are helping to elicit growing interest in using them as part of big data systems.
- March 31, 2017
Fitness company Beachbody set up a data lake system in the AWS cloud to support big data analytics applications after deciding that an on-premises deployment would be too complicated.
- March 08, 2017
Application profiling software from Pepperdata is built on LinkedIn's Dr. Elephant open source entry. A primary goal is to get more Hadoop and Spark applications into production.
- February 21, 2017
Moving custom Spark and Hadoop pilot projects into production use has proved daunting. But container technology eased the transition at the Advisory Board analytics service.
- February 06, 2017
Predictive models help Jewelry Television's on-air hosts sell its wares, thanks to data integration and preparation processes that funnel a mix of data into the analytics applications.
- February 06, 2017
Data scientists building predictive models and machine learning algorithms often have to do more data preparation work upfront than is necessary in conventional analytics applications.
- February 06, 2017
Increased automation of data pipelines and more flexibility for data scientists through self-service software are taking hold as big data deployments change data preparation practices.
- January 31, 2017
Big data analytics and digital transformation challenge the conventional data governance process, with many questions to answer in organizations. But a governed data lake shows how it can be done.
- December 02, 2016
Amazon's Athena data engine brings interactive SQL queries to S3 data sets and lets users pay as they go. It's based on an open source framework called Presto that Teradata and others also employ.
- November 30, 2016
The Louisiana Department of Health responded to flooding with the help of GIS software that located trouble spots with at-risk hospitals. Ease of use was welcome, according to a preparedness manager.
- October 28, 2016
Big data is moving from its bare-metal roots, and data streaming is a driver. Containers and microservices may have a role to play in what's next. An e-commerce application shows the way.
- October 06, 2016
What's in your toolbox? October's issue of Business Information turns the tables and puts that burning question to Capital One and several other business intelligence and data analytics software users. As the burgeoning worlds of ...
- September 30, 2016
Users increasingly are eyeing the cloud for big data management and analytics applications, and IT vendors are moving to ease the process -- and the price -- of running Hadoop in the cloud.
- September 15, 2016
HPE is paring down its software holdings, including analytical database software in the Vertica line and other big data tools. A sale to Micro Focus is due to close next year, leaving users in some limbo for now.
- August 31, 2016
Vertica 8.0 expands the analytical database's support for Kafka, Spark and Hadoop. That's an important step, as the Hewlett Packard Enterprise technology tries to compete in a field of diverse data tools.
- August 30, 2016
Cloud data warehouse offerings from smaller vendors seek to address functionality gaps that bigger players may miss. Newcomer Snowflake Computing targets concurrent queries, for example.
- August 23, 2016
Managed data services are growing in use, as types of data stores proliferate and the cloud becomes the home for more data. DevOps is a driver behind the changes, which bring new duties and needed skills for DBAs.
- July 06, 2016
Hadoop management is becoming a bigger priority for big data users and vendors alike as the distributed processing framework plays a more central role in the business operations of organizations.
- June 24, 2016
On the occasion of ComputerWeekly's 50th anniversary, Brian McKenna joins the Talking Data podcast crew to look back at Bletchley Park, and forward to Hadoop and AI.
- June 17, 2016
Graph technology is popping up in many places, including master data management. A major data integration player has joined the quest, as seen recently at Informatica World.
- June 01, 2016
Data wrapping -- in this case, bundling data and analytics services with products -- may entice more companies to become data businesses. A panel at an MIT symposium considered some best practices for doing so.
- April 29, 2016
Open source data engineering has become a way of life at e-commerce leader eBay, says the company's Debashis Saha. Kylin is one of the tools that has resulted.
- April 22, 2016
A new view on hybrid data architectures, in which data lakes and warehouses coexist, emerged at EDW 2016. The hybrid approach has implications for data design, skills and planning.
- April 19, 2016
Running a Hadoop cluster in the data center isn't for the weak. But several new tools aim to give IT operations teams a closer look into what's going on inside Hadoop-based big data systems.
- April 13, 2016
Pivotal Software dropped out of the Hadoop distribution business in favor of reselling the Hortonworks version of the big data framework -- and the market consolidation moves may not be over.
- April 01, 2016
Moving streams of data is a must in many modern applications. As a result, streaming analytics systems with Spark Streaming, Kafka and other components are coming to the big data forefront.
- March 31, 2016
At Strata + Hadoop World 2016, Hadoop co-creator Doug Cutting said the core of the distributed processing framework is likely to see its position at the center of big data systems diminish.
- March 24, 2016
The Strata + Hadoop World conference focuses on big data management and analytics technologies, in particular the Hadoop distributed processing framework and Spark processing engine.
- March 02, 2016
Looking to better balance system stability and innovation, Hadoop distribution provider Hortonworks will follow two release 'cadences' for different component sets in its HDP package.
- February 04, 2016
Hadoop has been slowly plodding through the big data jungle, but SQL's integration may put a spring in the elephant's step.
- January 28, 2016
In a Q&A as Hadoop reaches one 10-year milestone in its development, co-creator Doug Cutting talks about the adoption of the big data framework, and the history and future of Hadoop.
- January 27, 2016
Software architect Mansour Raad is at the center of activity as geospatial data melds with Hadoop -- and soon, Spark.
- December 29, 2015
The increasing adoption of self-service business intelligence tools and big data analytics applications is complicating data governance programs, BI Leadership Summit speakers and attendees said.
- December 21, 2015
This episode of the 'Talking Data' podcast looks at the word of the year in data analytics and management. In 2015, Spark joined Hadoop and MapReduce at the top of the list of trending big data technologies.
- December 21, 2015
In a Q&A, big data and data science expert Kirk Borne discusses new data processing and analytics technologies and the growing importance of data literacy in organizations.
- November 16, 2015
IBM's planned purchase of The Weather Co.'s data operations may be a bellwether event from which data professionals can learn.
- October 30, 2015
At its Insight 2015 conference, IBM featured Apache Spark, releasing a cloud-based Spark service to support analytics applications and detailing Spark use in some of its own tools.
- October 27, 2015
Dell and others have a new ETL reference architecture. Its purpose is to ease migrations to Cloudera Hadoop. Also: Dell buys EMC; Syncsort is acquired.
- October 19, 2015
We may have outlived the era of killer apps defined in part by Walmart, but Hadoop big data applications may help the giant's quest for more growth.
- October 13, 2015
MapR takes JSON format data into Hadoop, while Teradata places its flagship database on AWS.
- October 07, 2015
Tracking 'What is Hadoop?' is getting more complex as the potential components of Hadoop systems increase -- and core elements such as HDFS are augmented by possible alternatives.
- September 30, 2015
The latest version of MongoDB finds the NoSQL database running on a new WiredTiger storage engine. Better performance and data compression are among MongoDB 3.0's touted benefits.
- September 30, 2015
The DataStax Cassandra engine, officially called DataStax Enterprise, is now Spark-certified. The move is one of several for the NoSQL database on a possible upswing, further evidenced by a new deal with Microsoft.
- September 17, 2015
The latest episode of BizApps Today examines barriers to Hadoop technology adoption, SQL-on-Hadoop options and the new concept of data storytelling.
- September 09, 2015
In a Q&A, Clarity Solution Group CTO Tripp Smith says to base SQL-on-Hadoop software decisions on actual workloads. Some Hadoop tools target batch jobs, while others are intended for interactive ones.
- August 13, 2015
In a Q&A, data warehousing expert Joe Caserta explains why a new generation of developers building Hadoop clusters and other big data systems may need an introduction to some fundamental rules of ETL.
- August 07, 2015
RelayHealth's Raheem Daya described the path he took to deploy and expand a Hadoop cluster for distributed data processing during a presentation at the 2015 TDWI conference in Boston.
- July 31, 2015
Database pioneer Michael Stonebraker told an MIT conference that data curation tools offer a step forward on data integration. Others said things are already progressing through other means.
- July 27, 2015
Metanautix Quest 2.0 is an upgraded query engine and data integration platform. Its designers helped create Google Dremel, a precursor for new big data query tools.