Data sprawl and legacy IT architectures can make it difficult for business and tech leaders to unify data assets in support of digital transformation initiatives. Yet establishing a standardized data delivery and access mechanism is critical when data is located across on-premise, public, private, and multi-cloud environments.
Not only do you need data to drive innovation, but in most industries you also need to prove that the way you handle that data meets compliance requirements. So, companies are increasingly searching for technology that is both flexible enough to support diverse platforms and secure.
Cloudera has developed a data platform that provides rapid insights by connecting on-premise data sources with public and private clouds, enabling users to access data readily, all with built-in governance and compliance features.
Cloudera is on the Acceleration Economy Top 10 Shortlist of Data Modernization Enablers.
Who They Are
Cloudera was founded in 2008 by a quartet of well-known tech-industry figures: Google’s Christophe Bisciglia, Yahoo!’s Amr Awadallah, Jeff Hammerbacher from Facebook, and Oracle’s Mike Olson.
Cloudera was developed as an analytics platform based on the Apache Hadoop software framework. In 2019, Cloudera merged with Hortonworks, another Hadoop developer, and the company was unified under the Cloudera brand. The company is headquartered in Palo Alto, Calif.
In 2021, Cloudera was taken private after a stint on the New York Stock Exchange. The company was acquired by Clayton, Dubilier & Rice and KKR for approximately $5.3 billion. Today, Cloudera has over 2,000 customers in 85 countries.
Robert Bearden is Cloudera’s CEO. Co-founder and former CEO of Hortonworks, Bearden has an illustrious history in management positions in the open-source software market.
Ram Venkatesh is Cloudera’s CTO. Venkatesh guides the company’s technical vision and has over 25 years of experience in the enterprise software industry. Kevin Cook is CFO at Cloudera. Before Cloudera, Cook held roles at Credit Suisse, Wachovia, and RBC Capital Markets.
What They Do
Cloudera’s core product is the Cloudera Data Platform (CDP). The CDP is a hybrid data cloud platform that enables companies to unify data from any source: structured or unstructured data in public or private clouds, on-premise data centers, or even machine data. After ingesting data, CDP enables companies to manage the availability of that data across the enterprise.
Core elements of CDP include data mesh, data fabric, and data lakehouse. More details on each of these follow:
Data mesh: Cloudera’s data mesh architecture uses a technology called Cloudera DataFlow for universal data distribution. Users can connect to any source using any method with low-code functionality. With the data mesh, users can unify data from diverse cloud environments and individual domains, then create and manage data independently in a unified platform.
Data fabric: Cloudera’s data fabric architecture supports hybrid data management. The core elements of this data fabric include a comprehensive data catalog to aid data governance and encourage self-service. Users can continuously observe and understand their data with a recommendation engine providing suggestions and insights. Data movement and replication are traceable and secure. Ultimately, the Cloudera data fabric provides a single view across the entirety of an organization’s data estate. Cloudera Shared Data Experience (SDX) is the approach Cloudera takes to apply security and governance to all data. SDX is part of the CDP architecture, enabling safe, compliant data use and governed access by design.
Open Data Lakehouse: Cloudera’s Open Data Lakehouse gives users access to all data wherever it resides in order to perform analytics at scale. Regardless of whether data is stored in a public or private cloud, the data lakehouse enables users across an organization to execute analytics on the same data using their preferred tools and methodologies without the requirement to move or lock it.
“Cloudera offers a suite of tools that meet many data engineering and data analytics needs. And as a working CIO, I prefer a vendor that supports diverse environments as our needs change. That’s where Cloudera comes in. Whether my data is on-prem or across industry clouds, they’ve got me covered. Cloudera’s breadth of tools and platforms means fewer things to worry about.”
Wayne Sadin, CIO and Acceleration Economy data modernization analyst
Who They’ve Impacted
HelloFresh, the on-demand grocery company, has grown exponentially in recent years and now provides pre-packed ingredients for 10 million monthly meals. One of the company’s core objectives with analytics is to predict which ingredients will be popular so it can optimize supply relative to demand. However, as the company’s customer base grew, its existing data platform failed to meet these requirements.
“Our legacy SQL database was hard to scale and difficult to bring in new data and users,” said Kai von Grambusch, director of Data & Analytics at HelloFresh, in a Cloudera case study. “It also had slow performance. It would take hours to figure out how many boxes were shipped, and as we hired more analysts and data scientists, it became quite difficult even to maintain that level of performance.”
Using Cloudera running on AWS, HelloFresh analyzes over 15 TB of data. More than 100 business users now run thousands of queries a day and have access to over 2,500 up-to-date business intelligence dashboards.
“Before we implemented Cloudera, we could only analyze about 5 TB of data,” said von Grambusch. “In less than a year, we’ve grown to analyze three times the amount of data, and we’re adding approximately one TB a month to the system. This includes unstructured and structured data on how customers interact with our products and service staff, and what feedback they provide in surveys.”
As well as providing HelloFresh with the insights it needs to develop new business opportunities and products, the data is increasing customer retention because HelloFresh better understands individual preferences.
Why Cloudera is on the Data Modernization Top 10 Shortlist
Cloudera is enabling companies to access and operationalize data from anywhere in their data estate, securely and at scale. Here’s why our practitioner analysts selected Cloudera to be on our Top 10 Shortlist of Data Modernization Enablers.
- Cloudera enables companies to utilize data from hybrid and multi-cloud environments with unhindered data access and management.
- The platform integrates security and governance by design, and that’s a real differentiator vs. data platforms that incorporate data governance as a plug-in.
- Cloudera utilizes the most cutting-edge data technologies (data fabric, data mesh, and data lakehouse) to flexibly support company structures, data architectures, and emerging data challenges.
- Customers including HelloFresh validate Cloudera’s positioning and the value of its technology in supporting complex analytics requirements.