News
- June 09, 2016
09 Jun'16
Databricks reveals more on Spark 2.0; IBM, Microsoft renew Spark push
Spark 2.0, with structured streaming and SQL 2003 support, is aborning as indicated at Databricks' Spark Summit, where R-to-Spark interfaces also popped up.
- June 03, 2016
03 Jun'16
Kafka streaming gets a new twist
Startup vendor Confluent is looking to place a stake in the big data ecosystem with Kafka streaming and management tools meant to reduce complexity in applications that place data in motion.
- June 01, 2016
01 Jun'16
MIT panel shares tips on how to get into the data business
Data wrapping -- in this case, bundling data and analytics services with products -- may entice more companies to become data businesses. A panel at an MIT symposium considered some best practices for doing so.
- May 19, 2016
19 May'16
Oracle NoSQL: An oxymoron waiting to happen?
New cloud apps seem ready-made for NoSQL. This may cause Oracle to put more focus on its Oracle NoSQL database, which is often overlooked amid a crush of NoSQL contenders.
- May 16, 2016
16 May'16
New data landscape augurs discovery-based architectures
In an interview, consultant Lakshmi Randall foresees changes in how data management is organized and executed as the overall data landscape shifts due to the adoption of big data systems.
- April 29, 2016
29 Apr'16
Big data challenges traditional data modeling techniques
Surging big data is changing data modeling techniques, including schema creation. The word from Enterprise Data World 2016: Data pros must adjust.
- April 29, 2016
29 Apr'16
EBay helps drive new style of data engineering
Open source data engineering has become a way of life at e-commerce leader eBay, says the company's Debashis Saha. Kylin is one of the tools that has resulted.
- April 22, 2016
22 Apr'16
Data lake meets warehouse in hybrid data architectures
A new view on hybrid data architectures, in which data lakes and warehouses coexist, emerged at EDW 2016. The hybrid approach has implications for data design, skills and planning.
- April 19, 2016
19 Apr'16
New tools offer a better view into managing Hadoop clusters
Running a Hadoop cluster in the data center isn't for the weak. But several new tools aim to give IT operations teams a closer look into what's going on inside Hadoop-based big data systems.
- April 13, 2016
13 Apr'16
Hadoop market consolidation continues with Pivotal's exit
Pivotal Software dropped out of the Hadoop distribution business in favor of reselling the Hortonworks version of the big data framework -- and the market consolidation moves may not be over.
- April 01, 2016
01 Apr'16
Streaming analytics puts data in motion at Strata + Hadoop 2016
Moving streams of data is a must in many modern applications. As a result, streaming analytics systems with Spark Streaming, Kafka and other components are coming to the big data forefront.
- March 31, 2016
31 Mar'16
Hadoop core components may not be central to big data future
At Strata + Hadoop World 2016, Hadoop co-creator Doug Cutting said the core of the distributed processing framework is likely to see its position at the center of big data systems diminish.
- March 25, 2016
25 Mar'16
Unstructured data is a misnomer
Nowadays, the term unstructured data pops up everywhere. It owes its popularity for a large part to the success of big data, to successful technologies such as NoSQL and Hadoop, and to formats such ...
- March 24, 2016
24 Mar'16
Strata + Hadoop World 2016: Hadoop and Spark in spotlight
The Strata + Hadoop World conference focuses on big data management and analytics technologies, in particular the Hadoop distributed processing framework and Spark processing engine.
- March 16, 2016
16 Mar'16
Social media startup uses NoSQL Redis Cloud to 'scale to infinity'
Because of growing data demands, and the need to nimbly scale up and down, a startup social networking platform chose a Redis Labs NoSQL database management system running on AWS.
- March 02, 2016
02 Mar'16
Hortonworks Hadoop distribution goes to two release tracks
Looking to better balance system stability and innovation, Hadoop distribution provider Hortonworks will follow two release 'cadences' for different component sets in its HDP package.
- February 29, 2016
29 Feb'16
Apache Spark architecture speeds data jobs, ousts MapReduce
Its collection of big-data processing features is priming the Apache Spark architecture for wider deployment. One key trait: Spark performance outpaces MapReduce in many Hadoop use cases.
- February 24, 2016
24 Feb'16
AtScale Benchmarks Three SQL-on-Hadoop Engines
Numerous SQL-on-Hadoop engines are available for accessing data stored in HDFS using the familiar SQL language. They all look promising, they all support a rich SQL dialect, but which ones is the ...
- February 24, 2016
24 Feb'16
Spark Streaming update to address growing torrent of big data
Amid the buzz at Spark Summit East 2016 in New York was word that the Spark data processing engine's stream processing architecture will be overhauled in the upcoming version 2.0 of the open source software.
- February 04, 2016
04 Feb'16
HR metrics and analytics bring opportunity, concerns
HR personnel oversee many of the perks that come with a job, such as paychecks and benefits; the elements that most people would rather avoid, such as firings and employee conflicts; and all things in between, such as education and training. Now, ...
- February 04, 2016
04 Feb'16
SQL-on-Hadoop platforms open brave new analytics world
Hadoop has been slowly plodding through the big data jungle, but SQL's integration may put a spring in the elephant's step.
- February 03, 2016
03 Feb'16
NoSQL revs up to the tune of the Spark connector
Attention's been placed on Spark running on Hadoop, but there are Spark connectors for NoSQL that usher in a new class of operational analytics.
- January 29, 2016
29 Jan'16
Data quality metrics a governance must, managers say
Data governance managers who spoke during an online conference said that tracking business-oriented metrics on data quality improvement is a key to success in governance efforts.
- January 28, 2016
28 Jan'16
Co-creator Cutting assesses Hadoop future, present and past
In a Q&A as Hadoop reaches one 10-year milestone in its development, co-creator Doug Cutting talks about the adoption of the big data framework, and the history and future of Hadoop.
- January 27, 2016
27 Jan'16
Geospatial data is on the map for Hadoop, Spark
Software architect Mansour Raad is at the center of activity as geospatial data melds with Hadoop -- and soon, Spark.
- January 14, 2016
14 Jan'16
MapR adds 'Streams' messaging to its Hadoop data pipeline
MapR's Hadoop distribution will add a message system to feed a streaming data pipeline. It takes a cue from open-source Kafka technology.
- December 29, 2015
29 Dec'15
Data governance process taxed by self-service BI, big data
The increasing adoption of self-service business intelligence tools and big data analytics applications is complicating data governance programs, BI Leadership Summit speakers and attendees said.
- December 28, 2015
28 Dec'15
IBM Watson APIs hold key to broader cognitive computing use
In 2015, APIs for IBM's Watson system were front and center as a means to bring cognitive computing applications to a broader corporate audience.
- December 21, 2015
21 Dec'15
Spark: It's the word of the year in 2015 for data analytics
This episode of the 'Talking Data' podcast looks at the word of the year in data analytics and management. In 2015, Spark joined Hadoop and MapReduce at the top of the list of trending big data technologies.
- December 21, 2015
21 Dec'15
Kirk Borne on data science and big data analytics, data literacy
In a Q&A, big data and data science expert Kirk Borne discusses new data processing and analytics technologies and the growing importance of data literacy in organizations.
- December 09, 2015
09 Dec'15
Concurrent app management tools work on Hadoop and Spark
If Hadoop and Spark are to sneak into the enterprise, they will need to be manageable. With Driven, Concurrent Inc. takes a stab at the problem.
- December 02, 2015
02 Dec'15
Analysts and Data Scientists Need SQL-on-Everything
Business analysts and data scientists no longer restrict themselves to internally produced data that comes from IT-managed production systems. For their analysis they use all the data they can lay ...
- November 16, 2015
16 Nov'15
The Weather Co. buy boosts IBM big data analytics push
IBM's planned purchase of The Weather Co.'s data operations may be a bellwether event from which data professionals can learn.
- October 30, 2015
30 Oct'15
IBM goes all-in on Spark analytics with new service, apps
At its Insight 2015 conference, IBM featured Apache Spark, releasing a cloud-based Spark service to support analytics applications and detailing Spark use in some of its own tools.
- October 27, 2015
27 Oct'15
Dell links with Syncsort to tune Cloudera Hadoop for offloaded ETL
Dell and others have a new ETL reference architecture. Its purpose is to ease migrations to Cloudera Hadoop. Also: Dell buys EMC; Syncsort is acquired.
- October 19, 2015
19 Oct'15
Big data applications to drive Walmart reboot
We may have outlived the era of killer apps in some part defined by Walmart, but Hadoop big data applications may help the giant's quest for more growth.
- October 13, 2015
13 Oct'15
MapR does JSON format; Teradata goes on AWS
MapR takes JSON format data into Hadoop, while Teradata places its flagship database on AWS.
- October 12, 2015
12 Oct'15
Big Data Myth 4: Big Data is Unstructured Data
Not so long ago I attended a session in which the speaker was very clear on what big data is and what it is not. In his opinion, big data is unstructured data and unstructured data is big data. ...
- October 07, 2015
07 Oct'15
Core Hadoop components get more company as vendors diversify
Tracking 'What is Hadoop?' is getting more complex as the potential components of Hadoop systems increase -- and core elements such as HDFS are augmented by possible alternatives.
- October 05, 2015
05 Oct'15
Big Data Myth 3: Big Data is Too Big for SQL
The third big data myth in this series deals with how big data is defined by some. Some state that big data is data that is too big for a relational database, and with that, they undoubtedly mean a ...
- September 30, 2015
30 Sep'15
MongoDB 3.0 rides WiredTiger engine to boost database speeds
The latest version of MongoDB finds the NoSQL database running on a new WiredTiger storage engine. Better performance and data compression are among MongoDB 3.0's touted benefits.
- September 30, 2015
30 Sep'15
DataStax Cassandra moves ahead with Azure deal and graph store
The DataStax Cassandra engine, officially called DataStax Enterprise, is now Spark-certified. The move is one of several for the NoSQL database on a possible upswing, further evidenced by a new deal with Microsoft.
- September 28, 2015
28 Sep'15
Strengthening Data Preparation with Data Virtualization
Self-service analytics allows users to design and develop their own reports and do their own data analysis with minimal support by IT. Most recently, due to the availability of tools, such as those ...
- September 22, 2015
22 Sep'15
Madsen looks at shift in big data analytics applications
At a TDWI Boston Chapter meeting, Mark Madsen says some notions of information become outdated in the face of big data analytics. This is part one of two.
- September 22, 2015
22 Sep'15
New models of processing stalk big data analytics applications
Operations and big data analytics applications are beginning to blend, causing changes in data strategies, Mark Madsen tells a TDWI Boston Chapter meeting. This is part two of two.
- September 22, 2015
22 Sep'15
Big Data Myth 2: Analytics Requires Big Data
In Part 1 of this series on big data myths I indicated that the goal of most big data projects is analytics. In other words, big data systems are almost always developed to improve the analytical ...
- September 18, 2015
18 Sep'15
Big Data Myth 1: Big Data is the Goal
Big data is an incredibly popular topic. Plentiful articles, books, and blogs have been written on this topic and countless sessions have been presented that discuss some aspect of big data. I am a ...
- September 17, 2015
17 Sep'15
Low adoption nags Hadoop technology; data storytelling rises
The latest episode of BizApps Today examines barriers to Hadoop technology adoption, SQL-on-Hadoop options and the new concept of data storytelling.
- September 10, 2015
10 Sep'15
Alteryx software targets Amazon big data
The latest Alteryx analytics engine gets enterprises close to Amazon big data, with nods to spreadsheet aficionados and R language.
- September 09, 2015
09 Sep'15
Evaluating SQL-on-Hadoop tools? Start with the use case
In a Q&A, Clarity Solution Group CTO Tripp Smith says to base SQL-on-Hadoop software decisions on actual workloads. Some Hadoop tools target batch jobs, while others are intended for interactive ones.
- August 31, 2015
31 Aug'15
IBM pushes MPP engine to boost cloud database services line
IBM has added massively parallel processing and R language support to its dashDB software -- important steps for a data warehouse database tailored for distributed cloud computing.
- August 18, 2015
18 Aug'15
Finding a way off data isolation island
New technology is supposed to eliminate problems. With data isolation, new business tools are only making the longstanding issue worse. Companies are amassing on-premises tools and cloud applications. And those cloud apps come in many varieties: ...
- August 18, 2015
18 Aug'15
Vertica 'Excavator' release adds Kafka messaging and more
HP's Vertica analytics platform update will include better streaming data support via Kafka, and SQL-on-Hadoop improvements.
- August 14, 2015
14 Aug'15
DataTorrent data ingestion tool aims to speed Hadoop feeds
A new data ingestion and extraction tool supporting the Hadoop Distributed File System is at the heart of startup vendor DataTorrent's efforts to broaden its big data analytics engine's appeal.
- August 13, 2015
13 Aug'15
Don't throw out design principles when jumping in Hadoop data lake
In a Q&A, data warehousing expert Joe Caserta explains why a new generation of developers building Hadoop clusters and other big data systems may need an introduction to some fundamental rules of ETL.
- August 07, 2015
07 Aug'15
Take measured steps to build a Hadoop cluster architecture
RelayHealth's Raheem Daya described the path he took to deploy and expand a Hadoop cluster for distributed data processing during a presentation at the 2015 TDWI conference in Boston.
- August 07, 2015
07 Aug'15
MongoDB database shines; customer data analysis brings challenges
This episode of BizApps Today looks at the appeal of MongoDB database features and why companies struggle with determining a unified view of their customers.
- July 31, 2015
31 Jul'15
Stonebraker points to data curation as next integration step
Database pioneer Michael Stonebraker told an MIT conference that data curation tools offer a step forward on data integration. Others said things are already progressing through other means.
- July 31, 2015
31 Jul'15
Couchbase NoSQL database addresses mobility, synching issues
Couchbase offers an embedded version of its NoSQL database, which can fit on a smartphone. That feature turned out to be a plus for air carrier Ryanair.
- July 27, 2015
27 Jul'15
Metanautix Quest follows Dremel path to virtual data marts
Metanautix Quest 2.0 is an upgraded query engine and data integration platform. Its designers helped create Google Dremel, a precursor for new big data query tools.
- July 02, 2015
02 Jul'15
Convergence of Data Virtualization Servers and SQL-on-Hadoop Engines?
Hadoop has become a popular and powerful platform for data storage and data processing. Data stored in Hadoop can be used by a wide range of applications and tools and for a wide range of use ...
- June 30, 2015
30 Jun'15
Warehouse is purpose-built for JSON data
SonarW is a columnar data warehouse built especially to handle JSON data. It can support data warehousing for MongoDB systems and some data lakes.
- June 30, 2015
30 Jun'15
The Data Strategy is the Puzzle
In the history of IT, the number of times that IT departments proposed to top management to invest in a new technology, a new data quality program, or a new design technique, but were not able to ...
- June 16, 2015
16 Jun'15
New tools tap SQL skills in bid to boost Hadoop platform adoption
In many organizations, Hadoop is still pushing to go beyond proof-of-concept projects. Some vendors hope new tools that enable familiar SQL querying will lead it to broader adoption.
- June 15, 2015
15 Jun'15
Sales intelligence tools shape strategies, customer experience
Sales intelligence is making brainiacs out of sales reps. As companies look for ways to capture the attention of customers new and old, many are turning to analytics tools and data to help seal the deal.
Some use data to shape sales ...
- June 09, 2015
09 Jun'15
MongoDB enters world of swappable data storage engines
MongoDB 3.0, which now has swappable data storage engine support, gained much attention at MongoDBWorld 2015, even as MongoDB 3.2 details emerged.
- June 08, 2015
08 Jun'15
Embarcadero XE7 data models adapt to Agile methods
Embarcadero’s XE7 ER/Studio tool suite intends to make data models a more intrinsic part of Agile development projects.
- May 29, 2015
29 May'15
CTO looks to open source big data tools to help take company higher
Big data technologies have become vital to online ad platform developer Altitude Digital. And as the tools have evolved, CTO Manny Puentes has learned some lessons about using them.
- May 29, 2015
29 May'15
Book author says big data NoSQL databases need proper application
Consultant Dan Sullivan, author of "NoSQL for Mere Mortals," discusses some of the pertinent points for IT shops to consider before entering the new world of NoSQL databases.
- May 28, 2015
28 May'15
Neo4j software update focuses on NoSQL database engine room
The latest release of Neo Technology's Neo4j graph database boosts the NoSQL software's performance and scalability through faster processing of both data reads and writes.
- May 28, 2015
28 May'15
Talking Data podcast goes to data journalism and analytics school
In this episode of the Talking Data podcast, TechTarget editors report on their experiences learning to do data journalism, including all the work needed to get data ready for analysis.
- May 28, 2015
28 May'15
Microsoft Power BI in Convergence spotlight; Apache Spark draws interest
While Power BI took center stage at Microsoft Convergence, many users are struggling with CRM basics. Also: insight on the Spark processing engine.
- April 29, 2015
29 Apr'15
SQL-on-Hadoop startup JethroData releases 1.0 engine
Startup JethroData's namesake Version 1.0 software is an index-based, SQL-on-Hadoop engine that forgoes full scans of Hadoop data sources.
- April 27, 2015
27 Apr'15
Spark helps power Paxata big data preparation platform
Count Paxata among startups set to apply machine learning to problems of big data preparation.
- April 24, 2015
24 Apr'15
New thinking needed in IT to make NoSQL data modeling process work
As NoSQL databases become more prevalent in enterprise processing, designing data models for the flexible-schema systems is a looming challenge for data architects and developers.
- April 15, 2015
15 Apr'15
How Self-Service BI is Changing Data Modeling
There was a time when data models were only touched by experts in white coats wearing soft cotton gloves while carrying them around. The data models were kept in a vault that Fort Knox would have ...
- March 31, 2015
31 Mar'15
Polyglot Persistence and Future Integration Costs
Polyglot persistence must be the ugliest term we have ever come up with in the IT industry. It means that different persistency technologies (read data storage technologies) are used to store data. ...
- March 31, 2015
31 Mar'15
Data Virtualization and Data Vault: Double Agility
Data vault is a modern approach to design enterprise data warehouses. The two key benefits of data vault are data model extensibility and reproducibility of reporting results. Unfortunately, from a ...
- March 26, 2015
26 Mar'15
For some companies, business applications in the cloud is first choice
Cloud applications are no longer curious IT experiments -- now they're often on shortlists of deployment options.
- March 25, 2015
25 Mar'15
Couchbase, VoltDB pursue more database scalability in big data apps
New database updates address large-scale applications. Included are Couchbase Server 4.0 with multidimensional scaling support and VoltDB 5.0 with links to Hadoop data streaming tools.
- March 09, 2015
09 Mar'15
New Hadoop projects aim to boost interoperability, data lake benefits
As companies get a better handle on how to use big data wisely, a new effort on interoperability for Hadoop projects gets a mixed reaction.
- March 02, 2015
02 Mar'15
MapR Hadoop update plies data center waters; Azure learns new tricks
Hadoop vendor MapR's latest release puts the focus on database replication across data centers. Also, Microsoft has built a Python-friendly Azure service for machine learning in the cloud.
- February 24, 2015
24 Feb'15
Ferment accompanies new Hadoop ecosystem initiative
The Open Data Platform has arrived, but not all Hadoop vendors are on board. The initiative, aimed at boosting interoperability, formed a backdrop for discussion at the Strata + Hadoop World 2015 conference.
- February 20, 2015
20 Feb'15
Western Union connects the dots to enable enterprise Hadoop use
Building and running enterprise Hadoop applications takes more than data crunching. First, Hadoop data must be absorbed into company processes, a Western Union IT manager says.
- February 17, 2015
17 Feb'15
Hadoop data lakes must get more efficient, less 'messy' to oust EDWs
In a Q&A, Forrester analyst Mike Gualtieri said Hadoop-based data lakes can become an alternative to enterprise data warehouses. But first, faster I/O and better data governance are needed.
- February 09, 2015
09 Feb'15
Forget meaningful use; for healthcare providers, it's all about interoperability
With digital records finally the norm in healthcare organizations, providers are today looking at how to use that data to help achieve more positive medical outcomes.
- February 06, 2015
06 Feb'15
Hadoop data lake not a place for just lounging around
A new Gartner report says the storage repository isn’t the trouble-free panacea many observers hail it to be. New data governance practices -- and new skills -- are critical.
- January 30, 2015
30 Jan'15
Initiative targets Hadoop data management, better data policies
Hortonworks forms a data management initiative with Merck and others. Meanwhile, SQL and NoSQL may blur a bit.
- January 22, 2015
22 Jan'15
Oracle GoldenGate updates; Xplenty, Segment in deal
An update to Oracle's GoldenGate replicator is on tap. And, Xplenty and Segment connect on Hadoop processing in the cloud.
- January 20, 2015
20 Jan'15
IBM works to deliver on Watson's cognitive computing promise
Watson Analytics and a Watson school competition look to bring IBM cognitive computing initiatives to the commercial sphere.
- January 16, 2015
16 Jan'15
Data management plan with integration focus drives healthy analytics
Don't let the buzz around analytics fool you. Solid data management capabilities are required to enable advanced analytics, as an example from the healthcare sector attests.
- January 06, 2015
06 Jan'15
Data warehouse software not ready for retirement in age of big data
Hadoop clusters, NoSQL databases and other modern technologies have roles to play in business intelligence and analytics environments. But traditional data warehouses still do, too.
- July 29, 2014
29 Jul'14
Agile data warehousing casts business light on dark process
At the TDWI Executive Summit in Boston, users talked about the benefits and challenges of incorporating Agile development methodologies into data warehouse and business intelligence projects.
- June 18, 2014
18 Jun'14
Apache Kafka committer Jay Kreps on the way of the log
To keep big data in motion, LinkedIn developers devised Apache Kafka, a publish-and-subscribe message dispatcher based on the concept of logs.
- June 04, 2014
04 Jun'14
Apache Spark goes 1.0, looks to improve on MapReduce performance
The Apache Software Foundation released Version 1.0 of Spark, a processing framework aimed at outpacing MapReduce for machine learning and other uses.
- May 22, 2014
22 May'14
Former shadow IT worker helps bring analytics data into the light
Ryan Fenner worked in a shadow IT unit at Union Bank for years. Now he's helping lead an IT effort to better manage BI data created in those shadows.
- May 08, 2014
08 May'14
Big data charge could turn data management profession on its head
Discussed at DAMA's EDW 2014 were more flexible approaches to the data management profession. That's in response to growing mountains of diverse data.
- April 10, 2014
10 Apr'14
Think Big Analytics' Bodkin: Big data adoption means breaking silos
Hadoop has the ability ultimately to break down siloed data stores, Ron Bodkin says in a Q&A.
- March 31, 2014
31 Mar'14
Chief information architect: A compass for information management
In a Q&A, author William McKnight discusses the importance of building a robust information architecture -- and having the right person in the lead.
- March 31, 2014
31 Mar'14
Mix of data management platforms tapped to support big data projects
Consultant John Myers discusses key trends in managing big data environments, including adoption rates for Hadoop clusters and NoSQL databases.