As more businesses move their on-premises databases to the cloud, many face a challenge that may take months or even years to work through.
How do they move terabytes, petabytes, and even exabytes of data to the cloud?
The issue is one that database administrators and IT pros have long grappled with, but it’s gaining urgency as businesses accelerate the transition to cloud databases. Gartner has forecast that 75 percent of all databases will be deployed or migrated to the cloud by 2022.
The complexities of database migration were highlighted unexpectedly on Snowflake’s earnings call for Q4 FY2021 when CFO Mike Scarpelli was asked about the company’s net revenue retention (NRR) rate, which was holding steady at 168%. Scarpelli explained that NRR was slow to change because it can take customers six months or longer to begin using Snowflake’s Data Cloud.
Snowflake CEO Frank Slootman added that database migrations have been hard “since time immemorial.” He described database migrations as lengthy, expensive, and risky, and said “customers are quite leery of them.”
Snowflake has tools and expertise to automate much of the process, but there is often other work involved in getting a project across the finish line, such as data mapping, workload optimization, and dealing with “proprietary artifacts” of legacy systems, Slootman said. It’s “not just a matter of throwing a big switch and hoping for the best,” he said.
Kelly Stirman, Google Cloud director of product management for databases, echoed that yellow-flag sentiment following Google Cloud’s announcement on March 31 that its Database Migration Service was generally available. “Most IT organizations view database migrations as about the highest risk thing you can do in all of IT,” said Stirman.
In fact, Snowflake may have lost at least one potential customer as a result of the heavy lifting involved in switching to its cloud platform. Teradata CEO Steve McMillan said in February that an existing customer that had planned on moving to Snowflake reversed course and decided to stay with Teradata after facing “technical challenges and migration delays.”
New tools can help
Cloud database vendors are offering a growing array of tools, services, and how-to resources to smooth the process. Google Cloud describes its new Database Migration Service as a “single-click” experience, a reference to its ease of use. In its initial release, Google Cloud DMS supports the popular PostgreSQL and MySQL databases, with plans for Microsoft’s SQL Server later this year.
Google Cloud DMS supports homogeneous, or same-to-same, database migrations. For example, it can be used to migrate an on-premises PostgreSQL database to Cloud SQL for PostgreSQL, Google Cloud's managed PostgreSQL offering.
Database migrations get more complicated when they involve different source and target databases—say from an Oracle database to PostgreSQL—or when there are hundreds of terabytes to move.
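To see why heterogeneous migrations are harder, consider just one of the steps involved: schema conversion. The sketch below is a deliberately simplified illustration of mapping Oracle column types to rough PostgreSQL equivalents; the mappings and the `convert_column` helper are illustrative assumptions, not an exhaustive conversion table or the behavior of any vendor's actual tool, which must also handle precision, constraints, indexes, and procedural code.

```python
# Illustrative sketch of one small piece of a heterogeneous migration:
# translating Oracle column types into rough PostgreSQL equivalents.
# Simplified on purpose; real schema-conversion tools cover far more.

ORACLE_TO_POSTGRES = {
    "VARCHAR2": "VARCHAR",
    "NVARCHAR2": "VARCHAR",
    "NUMBER": "NUMERIC",
    "DATE": "TIMESTAMP",  # Oracle's DATE also carries a time component
    "CLOB": "TEXT",
    "BLOB": "BYTEA",
}

def convert_column(name: str, oracle_type: str) -> str:
    """Return a PostgreSQL column definition for an Oracle column type."""
    pg_type = ORACLE_TO_POSTGRES.get(oracle_type.upper())
    if pg_type is None:
        # Unmapped types are exactly where migrations stall in practice.
        raise ValueError(f"No known mapping for Oracle type {oracle_type!r}")
    return f"{name} {pg_type}"

print(convert_column("created_at", "DATE"))
```

Multiply those small mismatches across thousands of tables, plus stored procedures and application queries written against one vendor's SQL dialect, and the months-long timelines described below start to make sense.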
Ashish Yajnik, senior VP of product management with Teradata, said simple, straightforward data migrations can sometimes be accomplished in just a few days. However, “data is not isolated,” he added, and projects can take several months when workloads, applications, and analytics tools are factored into the process.
Large-scale database migrations typically require even more time. For customers with many petabytes of data across the enterprise, it may take two to three years to complete a wholesale move to the cloud, Yajnik said.
AWS offers a case in point: the company recently said it took two years to move 7,500 Oracle databases used by Amazon.com—75 petabytes of data in all—to AWS’s own Aurora and purpose-built databases.
Data migration on 18 wheels
AWS relied on its Database Migration Service and Schema Conversion Tool for that two-year database migration undertaking. AWS CEO Andy Jassy said in December that the company’s Database Migration Service had been used by customers to move 350,000 databases to AWS.
AWS also has its “Snow” family of physical data transfer devices, ranging from Snowcone units, which are small enough to be carried by a drone, to Snowmobile, a storage container on 18 wheels used for mega projects involving hundreds of petabytes or even exabytes of data.
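A back-of-envelope calculation shows why a truck can beat the network at that scale. The sustained 10 Gbit/s link speed below is purely an illustrative assumption, not a figure from the article:

```python
# Back-of-envelope: how long to push 100 PB over the wire?
# The 10 Gbit/s sustained link is an illustrative assumption.

def transfer_days(data_bytes: float, link_bits_per_sec: float) -> float:
    """Days needed to move data_bytes over a link at link_bits_per_sec."""
    seconds = (data_bytes * 8) / link_bits_per_sec
    return seconds / 86_400  # seconds per day

ONE_PB = 10 ** 15  # one petabyte, decimal
days = transfer_days(100 * ONE_PB, 10 * 10 ** 9)  # 100 PB at 10 Gbit/s
print(f"{days:.0f} days")  # roughly 926 days
```

At that rate, a single 100-petabyte dataset would spend over two and a half years in transit, which is the niche that physical transfer appliances like Snowmobile are built to fill.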
Microsoft, Teradata, and other cloud database vendors have their own database migration offerings. SAP says its Database Migration Factory can help customers with migration strategy, cross-platform integration, and risk management.
In March, Oracle announced Oracle Cloud Lift Services, a program with technical tools, engineering resources, and tech support to help customers move data warehouses and other workloads to Oracle Cloud Infrastructure.
What else can businesses do to manage these often complex but critical database migrations? Hybrid cloud environments, where workloads can be distributed between public and private clouds for as long as necessary, help mitigate the risk, said Teradata’s Yajnik.
But the bottom line is that there is no margin for error with corporate data. Businesses want assurances, said Slootman, that “the integrity is 100% maintained.”
For more on tools and strategies for cloud database deployment, subscribe to the Cloud Database Report.
RECOMMENDED READING
Who’s #1 Cloud Database: Oracle, AWS, Microsoft? Cockroach, Databricks, Mongo?
Surging Cloud Databases Will Blow Past Legacy Databases in Mega Platform Shift
Snowflake: 4 Big Steps on Journey to $1 Billion in Data Cloud Revenue
The Cloud Database Market Is Booming: 10 Key Developments
Can Oracle Beat AWS and Snowflake in the Cloud Database Wars?
Subscribe for free to the Cloud Database Report for timely news, insights, and expert interviews.