● New Dremio survey of enterprise IT professionals finds data lakehouses are the primary architecture for delivering analytics, with 65% running a majority of analytics on lakehouses
● Over half are saving more than 50% on analytics, and 81% are using a data lakehouse to support work on AI models and applications
Dremio, the easy and open data lakehouse, today announced the release of its survey findings and full report, The State of the Data Lakehouse, 2024. The report offers fresh insights from 500 full-time enterprise IT and data professionals on data lakehouse adoption, open table format trends, data mesh implementation for self-service analytics, and AI’s impact on the lakehouse and beyond.
Data lakehouse adoption is on the rise and cost savings are key
The data lakehouse is fast becoming the primary architecture for delivering analytics. 65% of survey respondents now run a majority of their analytics on lakehouses, and they cited cost efficiency and ease of use as the top reasons.
- 70% of respondents say more than half of all analytics will be on the data lakehouse within three years, and 86% said their organization plans to unify analytics data.
- Over half (56%) expect they are saving more than 50% on analytics by moving to the data lakehouse; almost 30% of respondents from large enterprises with more than 10,000 employees expect their savings are greater than 75%.
- 42% moved from a cloud data warehouse to the data lakehouse—more than from any other environment. Top reasons for the shift were cost efficiency and ease of use.
Open table formats are transformative and Apache Iceberg is quickly gaining momentum
Amid the generative AI frenzy, a quieter revolution has been taking place: Open table formats—a foundational component of data lakehouses—are bringing full SQL functionality directly to the data lake. This enables organizations to move away from decades-old data warehouse architectures and their associated inefficiencies.
The survey found that Apache Iceberg and Delta Lake are clearly the leading open table formats, and it confirmed Iceberg’s growing popularity. While 39% of respondents currently use Delta Lake, compared to 31% who use Iceberg, 29% of respondents adopting an open table format in the next three years plan to choose Iceberg, compared to 23% for Delta Lake.
Respondents cited multiple factors that influenced their choice of a particular table format including: performance (77%), compatibility with specific tools or platforms (72%), specific features (62%), and an open ecosystem (59%).
Data mesh is at the heart of digital transformation and is driven more by business units
Full or partial data mesh implementations are happening at most enterprises and expansion is expected by nearly everyone, according to the survey. As a key technology enabling the success of data mesh strategies, the data lakehouse is making self-service analytics, domain-driven data ownership, data as a product, and federated governance a reality for teams on the ground.
- 84% of respondents have fully or partially implemented data mesh, and 97% expect data mesh implementation to continue to expand in the next year.
- Top objectives of implementing a data mesh are improved data quality (64%) and data governance (58%); roughly half of respondents also named agility, scalability, improved data access, and improved decision-making.
- Data mesh initiatives are driven more by business leaders and business units (52%) than by centralized IT teams.
The data lakehouse is critical in the AI era
The data lakehouse is already improving AI-driven data management, governance and compliance, as well as the work lives of IT professionals. Data self-service, which data lakehouses enable, is fundamental for AI development, as the vast majority of respondents say their enterprise is using a lakehouse to support data scientists building and improving AI models and applications. With respect to job-related issues, a majority cited manual repetitive processes and manual data merging and reconciliation as problems, speaking to the need for more automation and AI-assisted data management and governance.
- 81% of respondents are using a data lakehouse to support data scientists building and improving AI models and applications.
- 62% disliked manually merging and reconciling data from multiple sources, repetitive manual processes, and cleaning up raw data.
- Technical professionals overwhelmingly agree AI is a national security priority (84%), noteworthy in light of the recent U.S. executive order on AI.