How a leading healthcare staffing firm migrated MySQL to Snowflake 6x faster with Datafold

About

One of America's largest healthcare staffing firms cut their MySQL-to-Snowflake migration time from 6 months to 1 month using the AI-powered Datafold Migration Agent. They reduced costs by 80% while ensuring perfect data parity across 1,000+ custom queries.

Data team size: 5
Total employees: 1,000+
Data stack: MySQL, Snowflake, Tableau, GitHub, dbt Cloud

Key metrics:
  • 60% reduction in contractor reliance
  • 6x faster migration timeline (from 6 months to under 1 month)
  • 80% cost savings on migration

The challenge

The company maintained a critical reporting system built on MySQL. These reports, powered by complex queries that transform data ingested from upstream sources, provide essential credentialing information for healthcare professionals, ensuring they can legally practice in specific locations. The firm needed to migrate these reports to Snowflake without any service disruption to customers.

The migration presented several complexities:

  • Approximately 1,000 custom MySQL queries needed translation
  • Many of the queries contained complex joins and subqueries requiring careful translation
  • Reports had to produce identical results in both environments to maintain data parity
  • Manual validation of each query would require enormous effort and introduce risk
  • Some queries required special handling, including for non-deterministic behaviors

The company’s reporting ecosystem had evolved organically over the years, making it challenging to identify and categorize all the queries that required migration.

Initially, the data team considered a traditional approach: hiring contractors for manual SQL translation and validation. They estimated that the contractor approach would take 6-9 months to complete. But given the critical nature of these reports and the resources and timeline contractors would require, this approach simply didn't align with their business objectives.

The solution

The company leveraged Datafold's AI-powered Data Migration Agent (DMA), a full-cycle migration solution that addressed their key challenges:

AI-powered SQL translation with continuous refinement

Datafold deployed DMA to translate MySQL queries to Snowflake syntax, with outputs validated using Datafold's proprietary data diff technology. Unlike traditional translators that rely on static code parsing and perform no validation of outputs, Datafold's DMA leverages large language models (LLMs) with a validation-driven feedback loop that intelligently identifies and corrects translation errors. This approach handled complex SQL patterns that simple translation tools would miss, creating a self-improving system that continuously refined the code until achieving parity.
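The validation-driven feedback loop described above can be sketched in miniature as follows. This is an illustrative Python toy only, under stated assumptions: `translate` stands in for an LLM translation pass and `diff` stands in for a cross-database data diff; neither reflects Datafold's actual API or implementation.

```python
# Minimal sketch of a validation-driven translation loop. The functions
# below are hypothetical stand-ins, not Datafold's actual API:
# translate() plays the role of an LLM translation pass, and diff()
# plays the role of a cross-database data diff over query outputs.

def translate(sql: str) -> str:
    # Stand-in: fix one MySQL-specific construct per pass, the way an
    # LLM would be re-prompted with the remaining differences.
    return sql.replace("IFNULL(", "COALESCE(", 1)

def diff(sql: str) -> list[str]:
    # Stand-in: report any remaining MySQL-only constructs as differences.
    return ["IFNULL is MySQL-specific"] if "IFNULL(" in sql else []

def migrate(sql: str, max_rounds: int = 5) -> str:
    """Translate, validate, and loop on the differences until parity."""
    for _ in range(max_rounds):
        sql = translate(sql)
        if not diff(sql):        # parity achieved: no differences remain
            return sql
    raise RuntimeError(f"no parity after {max_rounds} rounds")

print(migrate("SELECT IFNULL(rate, 0), IFNULL(name, '') FROM staff"))
# SELECT COALESCE(rate, 0), COALESCE(name, '') FROM staff
```

The point of the loop structure is that each round's differences feed the next translation attempt, so the system converges on parity rather than emitting a one-shot, unvalidated translation.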

Cross-database diffing for comprehensive validation

DMA's one-two punch combines translation with Datafold's industry-leading cross-database data diffing, which provides value-level comparison between source and target environments. For the company, this surfaced detailed schema comparisons, summary statistics, and row-level differences between the outputs of the original queries and the translated Snowflake versions. This automated validation eliminated the need to manually check innumerable data points across reports and provided clear evidence of migration success.
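To make "value-level comparison" concrete, here is a toy row-level diff keyed on a primary key. This is a minimal sketch for illustration only; Datafold's cross-database diffing operates at warehouse scale and is not implemented this way.

```python
# Toy value-level diff between two query result sets, keyed on a primary
# key column. Illustrative only -- a real cross-database diff pushes the
# comparison into the warehouses rather than pulling rows into Python.

def diff_rows(source: list[dict], target: list[dict], key: str = "id") -> dict:
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}
    report = {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "extra_in_target": sorted(tgt.keys() - src.keys()),
        "value_mismatches": [],   # (key, column, source value, target value)
    }
    for k in src.keys() & tgt.keys():
        for col, src_val in src[k].items():
            if src_val != tgt[k].get(col):
                report["value_mismatches"].append((k, col, src_val, tgt[k].get(col)))
    return report

# Hypothetical report rows from the source (MySQL) and target (Snowflake):
mysql_rows = [{"id": 1, "state": "CA", "active": 1},
              {"id": 2, "state": "TX", "active": 0}]
snowflake_rows = [{"id": 1, "state": "CA", "active": 1},
                  {"id": 2, "state": "NY", "active": 0}]
print(diff_rows(mysql_rows, snowflake_rows))
# {'missing_in_target': [], 'extra_in_target': [],
#  'value_mismatches': [(2, 'state', 'TX', 'NY')]}
```

Even in this toy form, the report distinguishes the three failure modes the case study alludes to: rows lost in migration, rows gained, and rows whose individual values drifted.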

Specialized handling for edge cases

For queries with special characters or non-deterministic behavior, the Datafold team implemented specialized handling techniques for DMA that addressed MySQL-specific behaviors. This ensured consistent results even when dealing with invisible characters, trailing spaces, and dialect-specific functions.
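A normalization pass of the kind described above might look roughly like this. The function below is a hypothetical illustration of handling invisible characters and trailing spaces before comparison, not Datafold's implementation.

```python
import unicodedata

# Hypothetical normalization applied to string values before comparison,
# so invisible characters and trailing spaces don't register as false
# differences. Illustrative only -- not Datafold's implementation.

def normalize(value):
    if not isinstance(value, str):
        return value
    # Drop zero-width spaces and byte-order marks; map non-breaking
    # spaces to ordinary spaces.
    cleaned = (value.replace("\u200b", "")
                    .replace("\ufeff", "")
                    .replace("\u00a0", " "))
    # Canonical Unicode form, then strip trailing whitespace (MySQL's
    # comparison rules often ignore trailing spaces; Snowflake's do not).
    return unicodedata.normalize("NFC", cleaned).rstrip()

# Two visually identical values that would otherwise diff as unequal:
assert normalize("Nurse\u200b Practitioner ") == normalize("Nurse Practitioner")
```

Normalizing both sides before diffing keeps the comparison focused on genuine data discrepancies rather than dialect-level representation quirks.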

Batched delivery through pull requests

Datafold delivered translations in batches through pull requests to the company's GitHub repositories, allowing for continuous integration and testing throughout the project. This continuous delivery created fast feedback loops and quickly built confidence among executive stakeholders that the migration was on an accelerated track.

The results

This leading healthcare staffing firm realized significant benefits from the Datafold implementation:

Timeline reduction from months to weeks

According to the team, Datafold compressed what would have been a months-long project into just weeks. The automated approach allowed the team to parallelize work that would have been sequential if they’d instead used contractors, dramatically accelerating their timeline while maintaining quality.

Data validation at massive scale

The team reported that automated cross-database validation provided a level of confidence that simply wasn't possible with a manual validation process. With Datafold, they could verify that every value matched perfectly between systems, eliminating the possibility of subtle data discrepancies that could impact critical decisions.

Redirecting internal resources to high-impact work

Ultimately, the team was able to reduce their contractor needs by over 60% while simultaneously accelerating the project. This allowed them to redirect budget and internal resources to strategic initiatives rather than migration work.

Looking ahead

The team quickly realized Datafold's capabilities extended far beyond their initial migration use case. Data lineage, automated testing, and observability and monitoring tools are helping them build a more robust data ecosystem for the long term, and the company is now implementing:

  1. CI/CD Integration: Automatically testing dbt model changes to predict downstream impacts before deployment
  2. Data Quality Monitoring: Setting up observability for critical metrics with anomaly detection
  3. Impact Analysis: Using lineage to trace how changes affect downstream Tableau dashboards

What began as a migration tool has evolved into a strategic platform for ensuring data quality across the company’s entire analytics stack.
