Migrating data is easy, migrating code is hard.
While many assume data migrations are primarily about moving data between platforms, the real challenge lies in migrating code. Data movement has become relatively simple thanks to modern tools, often taking just hours or days. However, code migration remains complex and time-consuming due to different SQL dialects, scattered codebases, and the need for extensive validation. This traditionally requires months or years of manual work, but Datafold's Migration Agent (DMA) is revolutionizing this process by using AI to automate code translation and validation, dramatically reducing migration timelines from years to weeks.

One of the most common misconceptions about data migrations is that the focus and pain of them is around moving the data between platforms. In reality, much of the pain, time, and manual work of a migration is focused on migrating the code.
Done in a day: Moving data
Despite what leadership or non-technical business consumers may believe, moving data is one of the easiest parts of a migration. Today, a variety of open source and SaaS tools like Airbyte and Fivetran, as well as data platform-native capabilities, allow teams to move data across systems with incredible speed. We’re at the point where data teams can move an entire data warehouse’s worth of data to a new database in as little as a few hours or days.

Not done in a day: Moving code
Most data in an analytical database is a string, numerical type, date, bool, or JSON-like object, easily transferable between databases. On the other hand, code is tightly coupled with the engine for which it's been written. Even pure SQL must be translated across dialects before executing on a different engine.
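To make the dialect problem concrete, here is a deliberately naive sketch of what translation involves. Production migrations use full SQL parsers and transpilers, not regex substitution; the function name and rewrite table below are hypothetical, illustrating only that functions like `NVL` or `GETDATE()` must be rewritten before the query runs on a new engine.

```python
import re

# Toy rewrite table: legacy dialect construct -> ANSI-style equivalent.
# (Hypothetical illustration; a real translator parses the SQL.)
DIALECT_FIXES = [
    (r"\bNVL\(", "COALESCE("),                # Oracle/Teradata -> ANSI
    (r"\bGETDATE\(\)", "CURRENT_TIMESTAMP"),  # SQL Server -> ANSI
    (r"\bSYSDATE\b", "CURRENT_TIMESTAMP"),    # Oracle -> ANSI
]

def translate_sql(legacy_sql: str) -> str:
    """Apply a small table of dialect rewrites to a legacy query."""
    out = legacy_sql
    for pattern, replacement in DIALECT_FIXES:
        out = re.sub(pattern, replacement, out, flags=re.IGNORECASE)
    return out

print(translate_sql("SELECT NVL(amount, 0), GETDATE() FROM orders"))
# -> SELECT COALESCE(amount, 0), CURRENT_TIMESTAMP FROM orders
```

Even this trivial case shows why translation is risky at scale: every rewrite changes code that previously worked, so every rewrite must be validated.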
This is typically where things in a migration start getting complex: migrating transformation code (e.g., stored procedures, SQL scripts, or even GUI-based XML) to your new platform. There are many reasons why this is a taxing project for data teams during a migration, but the two biggest I’ll focus on here are that:
- There’s often so much code
- Data is portable, and code is not
Real-world enterprise data migrations often involve untangling a gigantic mess of code spanning multiple languages, shell scripts, rigid procedural SQL, GUI-based ETL mappings, and business logic defined in tools you never thought still existed. And often at considerable scale: thousands of SQL models or GUI-based transformations that need to be moved over. Even for data teams on a smaller scale, translating hundreds of models can take months.

Unlike data that can more easily be moved and uploaded into a new platform, migrating the codebase often requires multiple laborious steps:
- Gathering all code, since it’s likely not in one single repository pre-migration.
- Mapping out dependencies, both on the data flow-level and cell-level (warning: this alone can take months).
- Breaking down the lineage graph into chunks, and dividing those among the engineers working on the migration.
- Translating, refactoring, and optimizing each chunk of code for the new platform.
- Validating the data between legacy and new code to ensure parity (warning: this also can take months).
- If there are data integrity issues (which there likely will be), going back to step #4 and repeating until parity is met.
This repeated process takes hours or days per object. In reality, this means thousands of hours of work at the enterprise scale, requiring million-dollar budgets, year-long timelines to complete, and often outsourced resources.
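The dependency-mapping and chunking steps above can be sketched in a few lines. This is a toy illustration with a hypothetical three-model codebase: it pulls table references out of each model's SQL with a naive regex (a real agent would use a proper SQL parser) and topologically sorts the models so upstream code can be migrated and validated first.

```python
import re
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical mini-codebase: model name -> its SQL definition.
models = {
    "stg_orders":  "SELECT * FROM raw_orders",
    "stg_users":   "SELECT * FROM raw_users",
    "fct_revenue": ("SELECT u.id, o.amount FROM stg_orders o "
                    "JOIN stg_users u ON o.user_id = u.id"),
}

def upstream_refs(sql: str, known: set) -> set:
    """Naive dependency extraction: identifiers following FROM/JOIN.
    Only references to models we own count as dependencies."""
    refs = re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*)", sql, flags=re.IGNORECASE)
    return {r for r in refs if r in known}

graph = {name: upstream_refs(sql, set(models)) for name, sql in models.items()}
order = list(TopologicalSorter(graph).static_order())
print(order)  # upstream staging models first, then the fact model
```

Doing this by hand across thousands of models, and keeping the chunk boundaries consistent between engineers, is exactly where the months disappear.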
Not only are migrations expensive in terms of actual dollars, but data platform migrations come at a great opportunity cost: near-term innovation and meeting business expectations as a data team. Doing all of these things at once—migration, innovation, meeting baseline needs—is a challenge for any data team.
The good news is that we’re thinking about migrations a little bit differently at Datafold: we believe that taking a modern, AI-native agentic approach can compress data platform migration timelines from years to weeks by automating analysis, translation, and validation in a 100% software-driven solution.
DMA: The AI-native agentic approach to migrations
The Datafold Migration Agent (DMA) challenges the status quo of migrations. By combining advanced AI, sophisticated LLMs, and precise data diffing technology, we've created something that many thought impossible: a way to automate the most painful parts of migrations while maintaining perfect data integrity. DMA takes the two most manual, taxing, and time-intensive parts of a migration—code translation and data validation—and automates them at previously inconceivable speeds.

At its core, DMA uses AI and LLMs to automatically:
- Map code dependencies
- Translate legacy code to the new SQL dialect and frameworks (e.g., dbt, Airflow, Coalesce.io)
- Validate that the new SQL outputs match exactly what the legacy code did
- Provide a code repository or PR of the translated and validated code
- Provide an auditable log of data diffs—value-level comparisons of tables across databases—to prove data parity
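The value-level data diffs mentioned above can be illustrated with a minimal sketch. Here both the "legacy" and "migrated" tables live in one in-memory SQLite database, and the table names, column, and helper function are hypothetical; real cross-database diffing compares tables across engines at warehouse scale, but the core idea is the same: join on a key and surface every value that differs.

```python
import sqlite3

# Hypothetical legacy vs. migrated outputs of the same transformation.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE legacy_orders (id INTEGER, amount REAL);
    CREATE TABLE new_orders    (id INTEGER, amount REAL);
    INSERT INTO legacy_orders VALUES (1, 10.0), (2, 25.5), (3, 7.0);
    INSERT INTO new_orders    VALUES (1, 10.0), (2, 24.5), (3, 7.0);
""")

def diff_tables(con, left, right, key="id"):
    """Return {key: (left_value, right_value)} for every row that differs."""
    rows_l = dict(con.execute(f"SELECT {key}, amount FROM {left}"))
    rows_r = dict(con.execute(f"SELECT {key}, amount FROM {right}"))
    return {
        k: (rows_l.get(k), rows_r.get(k))
        for k in rows_l.keys() | rows_r.keys()
        if rows_l.get(k) != rows_r.get(k)
    }

print(diff_tables(con, "legacy_orders", "new_orders"))
# -> {2: (25.5, 24.5)}  : id 2 fails parity and needs investigation
```

An empty diff is the proof of parity; a non-empty one points the translation loop back at exactly the rows that broke.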
What used to take months or years of manual work can now take a fraction of the time. DMA continues to push the boundaries of how data teams can approach and execute migrations to ultimately allow them to modernize faster and focus on higher-leverage work.
With ETL tools like Airbyte and Fivetran handling data movement with speed, and DMA focused on migrating and validating the code, data teams can now tackle data migrations with great urgency and integrity.