When something goes wrong with a data pipeline, it's often difficult for organizations to figure out what happened and how to fix it.

That's the challenge that Monte Carlo, based in San Francisco, is looking to solve with its Incident IQ technology, which the vendor added to its Data Observability Platform on July 14.

The new Incident IQ capability helps organizations identify data pipeline problems such as outages or latency and provides insight that can help remediate the root cause. The Monte Carlo Data Observability Platform works with data from a variety of sources including data lakes, data warehouses and event streaming.

Among the data observability startup's users is online consignment and thrift shop ThredUp, based in Oakland, Calif.

Satish Rane, head of data engineering at ThredUp, said the firm uses data from multiple sources including transactional and behavioral data from consumers, as well as data that comes from the company's operations.

ThredUp also relies on streaming data via Apache Kafka. Rane said that when he joined ThredUp a year ago, he noticed that a number of different groups at the company were introducing data, beyond just the data team.

"We had a decentralized approach to onboarding data," Rane said. "There are sacred things which the data team owns, which are critical for the finance side, and there are all these other on-boarders of data that probably do not go through the same regimen of what the data engineering team goes through."

Monte Carlo is now providing its users with more insight into data downtime incidents to help organizations fix issues.

The challenge for Rane and his team was understanding the source of data, as well as its structure and quality. The company was also running many data pipelines for which it wasn't entirely clear where they were being used, whether for analytics, business intelligence or operations within the company.

"With Monte Carlo, first of all what it did was really give us a pulse of our data, whether the data made sense and if something was not right," Rane said.

Monte Carlo Incident IQ boosting data observability

ThredUp has been testing the Incident IQ feature in the Monte Carlo Data Observability Platform and for Rane, it has been a positive experience so far.

"From the data engineering side, you look at the incident, and then all in one place you are able to see everything, like upstream and downstream dependencies and what was the root cause, right down to the piece of code," Rane said.

Lior Gavish, CTO of Monte Carlo, explained that typically when a data pipeline is broken, it requires a certain amount of time to find the problem and then even more time to figure out how to fix it.

Gavish said that Monte Carlo's platform had previously provided visibility into the health of data pipelines. With Incident IQ, Monte Carlo is going beyond spotting problems to providing more insight to help users quickly fix problems.