SAP to acquire data lakehouse vendor Dremio

SAP on Monday announced plans to acquire Dremio, which bills itself as an agentic lakehouse company, for an unspecified price. The move is complicated by similar offerings from existing SAP partners Snowflake and Databricks, but analysts point to key differences with Dremio, especially its ability to work with data while it sits in the enterprise’s environment, rather than requiring it to be moved to an external platform.

One of SAP’s justifications for the acquisition is that it will theoretically make it easier for IT executives to combine SAP data with non-SAP data. But its strongest rationale involves Dremio’s ability to make complex data more AI-friendly, so that enterprises can put it to use more quickly and cost-effectively.

“Most enterprise AI projects fail to deliver value not because of the AI itself, but because the underlying data is fragmented, locked in proprietary formats and stripped of the business context that makes it meaningful,” the SAP announcement said. “The result is a familiar and costly pattern: pilots that cannot scale, slow integration of new data sources, duplicated engineering work and compliance risk when organizations cannot explain how an AI-driven decision was reached. Dremio helps eliminate that data fragmentation and integration friction.”

While SAP cites the data quality argument, Dremio does not address many elements of enterprise data quality, including data that is outdated, comes from unreliable sources, or lacks meaningful context.

However, SAP said, “With Dremio, SAP Business Data Cloud will become an Apache Iceberg-native enterprise lakehouse that unifies SAP and non-SAP data to power agentic AI at enterprise scale. Apache Iceberg is the industry-standard open table format, and SAP Business Data Cloud will natively support it as its foundation.” This means that there need be no data movement or format conversion; SAP and non-SAP data “can coexist on the same open foundation, with federated analytical reach across every enterprise data source.”

Complicated comparison

Analysts and consultants said that any comparison of Dremio to existing SAP partners Snowflake and Databricks is complicated. For example, Dremio is younger and less established than either Snowflake or Databricks, which makes it a riskier match for enterprises.

SAP strategy specialist Harikishore Sreenivasalu, CEO of Aarini Consulting in the Netherlands, said that both Snowflake and Databricks would have been ideal acquisition targets many years ago, but they would be far too expensive today. 

“Databricks and Snowflake are better [for enterprise IT] for sure because they have a mature platform, they do multi-cloud,” whereas Dremio “is the new entrant in the market and they have to mature more to be enterprise ready. Their security aspects need to mature,” Sreenivasalu said.

But Sreenivasalu added that the situation could easily change after SAP invests and works with the Dremio team. He advised CIOs to “stick with where you are today but watch how technologies get integrated. Listen to the SAP roadmap.”

In a LinkedIn post, Sreenivasalu said the move still is very positive for SAP: “This is the missing piece. SAP has Joule. SAP has BTP. SAP has the business processes. Now it has the open data fabric to feed AI agents the context they need to act, not just answer. For those of us building on SAP BTP + Databricks + SAP BDC, this is a signal: the lakehouse and the ERP world are converging, fast. The future of enterprise AI just got a whole lot clearer.” 

Addresses LLM limitations

During a news conference Monday morning, SAP executives focused on how this move potentially addresses some of the key large language model (LLM) limitations with enterprise data, especially with predictive analytics.

Philipp Herzig, SAP’s chief technology officer, said that LLMs have various limitations, noting, “LLMs don’t deal really well with numbers” and that they struggle with structured data “where we have a lot of differentiation.” 

The practical difference arises when systems try to predict the future rather than analyze the past, such as when asking how well a retailer’s product will sell over the next 10 months, or predicting likely payment delays and their impact on projected cashflow. “This is where LLMs struggle a lot,” Herzig said. He also stressed that Dremio’s ability to work with enterprise data while it still resides in the organization’s on-prem systems is critical for highly regulated enterprises.

Local data difference

Flavio Villanustre, CISO for the LexisNexis Risk Solutions Group, also sees the ability to handle data locally as the big draw.

Databricks and Snowflake both offer strong functionality, he pointed out, but users must move the data to those platforms and reformat it. Once that is complete, the result is a central data lake that addresses data access needs. “Dremio, on the other hand, provides easy decentralized data access, allowing users to access their data in place,” he said. “Of course, this could be at the expense of data processing performance, but the ease of use and flexibility could outweigh the performance loss.” Implementation speed in days versus weeks or months is another plus, he added. “There is a significant benefit to that.”

Sanchit Vir Gogia, chief analyst at Greyhound Research, agreed with Villanustre, but only to a limited extent. 

“The distinction is not as clean as ‘Dremio lets data stay in place, while Snowflake and Databricks require everything to move,’” he noted. “Snowflake and Databricks have both invested significantly in external data access, sharing, open formats, governance layers, and interoperability. So it would be unfair to describe either as old-style ‘move everything first’ platforms.” But, he added, the broader argument is correct. “[Dremio] starts from the assumption that enterprise data is already distributed and that the first problem is often access, context, federation, and governance, not wholesale relocation. For SAP customers, that matters a great deal,” he said.

That’s because of the nature of many of SAP enterprise customers’ datasets. 

“Most large SAP estates are not clean, centralized data environments,” he pointed out. “They are brownfield landscapes: SAP data, non-SAP data, legacy warehouses, departmental lakes, regional repositories, acquired systems, partner data, and industry-specific platforms.” While telling these customers that AI-readiness begins with moving everything into one central platform may be good for the vendor, it’s a lot of work for the buyer.

Dremio gives SAP “a more pragmatic story,” Gogia said. “It allows SAP to say: keep more of your data where it is, access it faster, apply more consistent catalogue and semantic controls, and bring it into Business Data Cloud and AI workflows without forcing a major migration program upfront.”

Aman Mahapatra, chief strategy officer for Tribeca Softtech, a New York City-based technology consulting firm, noted that an acquisition of either Snowflake or Databricks would have obliterated SAP’s marketing message and sales pitch.

“SAP did not buy a data warehouse. They bought a position in the open table format wars, and the timing tells you exactly why Snowflake and Databricks were never realistic targets,” he said. “Acquiring either would have collapsed SAP Business Data Cloud’s neutrality story overnight and alienated half the customer base in either direction. SAP’s strategic position depends on sitting above the warehouse layer rather than inside it, and Dremio is the federated layer that talks to both Snowflake and Databricks without requiring SAP to pick a side.”

Assume things will change

Mahapatra urges enterprise CIOs to be extra cautious. 

“For IT executives with active Snowflake and Databricks contracts this morning, nothing changes in the next two quarters, but by the first half of 2027, expect SAP to steer net-new AI workloads toward Business Data Cloud regardless of what the partnership press releases say today. The CIOs who plan for that trajectory now will negotiate from strength,” Mahapatra said.

Compute and storage that data warehouse vendors provide is rapidly becoming a commodity, he said, and the “defensible value” in enterprise AI is migrating up the stack to the semantic layer, the catalog, the lineage graph, and the business context that lets an agent know what ‘active customer’ means within an organization.

“SAP just bought the toolkit to own that layer for any company running SAP at the core,” he said. “If you are an SAP-heavy shop running analytics on Snowflake or Databricks, your warehouse vendors are about to feel less strategic and more like high-performance compute backends.”

Corrects a strategic error

Jason Andersen, principal analyst for Moor Insights & Strategy, noted that for quite some time, SAP has been relentlessly encouraging enterprises to host all of their data within SAP systems. SAP can’t reverse that position even if it wanted to. 

What the Dremio deal does instead, Andersen said, is address the pockets of data that many enterprise CIOs, especially in manufacturing and highly regulated verticals, have refused to turn over to SAP. The Dremio deal gives SAP a face-saving way to get an even higher percentage of its customers’ data, he said.

“Manufacturing is loath to put things in the cloud and [manufacturing CIOs] put up a violent protest [against] going into the cloud,” Andersen said. “This [acquisition] lets SAP access a lot of data that hasn’t yet moved to SAP.”

Shashi Bellamkonda, principal research director at Info-Tech Research Group, said he sees the SAP-Dremio move as fixing a strategic error that SAP made years ago, when it did not develop its own Apache Iceberg capabilities.

“Apache Iceberg is an open-source table format designed for large-scale analytical datasets stored in data lakes, a kind of bridge between raw data files and analytical tools,” Bellamkonda said. “[SAP] should have done this earlier rather than waiting until 2026.”
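Bellamkonda’s “bridge” description can be made concrete with a short sketch. The structure below loosely mirrors the Apache Iceberg v2 table-metadata layout (the field names come from the Iceberg spec, but this is a simplified illustration, not a complete or spec-exact metadata file): a small metadata layer points at immutable data files, so any compatible engine can query the same data in place without moving or reformatting it.

```python
import json

# Simplified, illustrative sketch of an Iceberg-style table metadata
# document. Field names loosely follow the Iceberg v2 spec; paths and
# values here are made up for illustration.
table_metadata = {
    "format-version": 2,
    "location": "s3://bucket/warehouse/sales_orders",
    "schemas": [
        {"schema-id": 0,
         "fields": [
             {"id": 1, "name": "order_id", "type": "long"},
             {"id": 2, "name": "amount", "type": "decimal(10,2)"},
         ]},
    ],
    "current-schema-id": 0,
    # Each snapshot is an immutable view of the table; an engine picks
    # a snapshot and reads only the data files it references.
    "snapshots": [
        {"snapshot-id": 1, "manifest-list": "s3://bucket/metadata/snap-1.avro"},
    ],
}

def current_schema(meta: dict) -> dict:
    """Return the schema the table currently uses."""
    sid = meta["current-schema-id"]
    return next(s for s in meta["schemas"] if s["schema-id"] == sid)

# The metadata, not the engine, is the source of truth for the schema.
print(json.dumps(current_schema(table_metadata)["fields"]))
```

Because the metadata layer, rather than any one engine, owns the table definition, SAP Business Data Cloud, Dremio, Snowflake, and Databricks can in principle all read the same Iceberg tables without format conversion, which is the coexistence SAP is describing.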

This article originally appeared on CIO.com.
