Inside a Migration that Became a Nightmare
Gleb Mezhanskiy, a former Data Engineer at Lyft and the CEO and co-founder of Datafold, shares hard-earned lessons from a migration project that dragged on for years, faced countless setbacks, and ultimately redefined his approach to data engineering.
"Imagine spending two years on a project where everyone thinks it’s a failure," Gleb Mezhanskiy, former data engineer at Lyft and now founder of Datafold, reflects. "It’s a very real career risk." Data migrations, often seen as tedious, high-stakes undertakings, have earned their reputation as career killers. Gleb has spoken about this migration before, but in this interview, he shares the full story—what went wrong, the lessons he took from it, and how automation can transform migrations from career risks into career-boosting opportunities.
In this conversation, Gleb shares hard-earned lessons from his time at Lyft overseeing an embattled migration from Redshift to Hive, highlights the unexpected challenges of migrations, and explains how automation will reshape what’s possible in the field of data engineering.
The migration that went badly wrong
When Gleb decided to re-architect Lyft’s data model during a migration, he learned firsthand the consequences of scope creep and misaligned priorities. Here, he candidly shares the struggles his team faced, from endless stakeholder meetings to the painstakingly slow iteration speed of Hive, and offers insight into how teams can avoid similar pitfalls.
Datafold: What was the biggest unexpected challenge you faced during the migration?
Gleb: One of the bigger mistakes I made was not doing a lift-and-shift. A lift-and-shift means taking the data as is, moving it to a new system, and not changing the interface or schema or trying to optimize things too much. But I felt that our data model was bad. There was a lot of redundancy, a lot of overlapping tables, and people wanted better models. So I thought, "Okay, this is the time to actually build a proper data model from scratch."
At the time, Lyft was a 3,000-person company, so it was quite big. We had hundreds of tables and hundreds of people using the data. I initiated the remodeling process thinking, "No big deal. I’ve read the Kimball book on data modeling; I can do this. Easy peasy." And it was a disaster.
Instead of making progress on the migration, I got stuck in endless meetings with stakeholders, trying to flesh out a better data model. We were arguing about column names, what columns should be there, and just getting nowhere. That was bad.
I managed to get some buy-in for certain remodels. But once we implemented them in the new system, I had to go through another round of convincing people that this was better. They’d say, "Wait, I’m using this table, and now it works differently? What the heck? I need to change my queries and figure out how this new table works." It became a huge time sink for everyone.
That was a big mistake that led to an unexpected blow-up of scope. It’s probably the biggest mistake I made during the migration.
Datafold: That’s rough. I can’t imagine it getting worse than that.
Gleb: Another unexpected challenge was that the iteration speed was extremely slow. Translating the code wasn’t the issue; getting code translated from Redshift to Hive wasn’t where we spent the most time. We spent most of our time debugging.
For example, I’d have a query that now technically runs on Hive, but it doesn’t produce the same output. Why? I have no idea. So, I’d have to write a ton of queries to figure out where the discrepancies were, fix them, and re-run the process. Once I finally got it to match, I’d deploy it—but then I’d have to convince people to use it and switch from using Redshift to Hive.
But now the problem was they didn’t trust me that it was as good. They’d say, "Well, show me proof that it’s the same." I’d reply, "I’ve already run tests, and it is." But they’d reject that and say, "No, last time I checked, it wasn’t." And I’d think, "Oh my god."
So, we ended up building notebooks full of queries—basically manually written diffs—and scheduling those notebooks to run every day to compare outputs and share the results with people. It was very tedious and painful. The iteration process was a disaster. You’d translate something, fix the results, and then have to prove to the business that it was the same and stayed the same.
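The kind of manually written diff described above can be sketched in a few lines. This is a minimal illustration, not the queries Lyft actually ran or how Datafold works: it assumes both copies of a table are reachable from one connection (here an in-memory SQLite database stands in for both Redshift and Hive), and the table names `rides_redshift` and `rides_hive`, the columns, and the `diff_tables` helper are all hypothetical.

```python
import sqlite3


def diff_tables(conn, legacy, migrated, key, columns):
    """Summarize differences between two tables joined on a primary key.

    Table and column names are interpolated directly into SQL, so they
    must come from trusted code, never from user input.
    """
    def count(sql):
        return conn.execute(sql).fetchone()[0]

    report = {
        # Keys that exist in the legacy table but not in the migrated one.
        "only_in_legacy": count(
            f"SELECT COUNT(*) FROM {legacy} l LEFT JOIN {migrated} m "
            f"ON l.{key} = m.{key} WHERE m.{key} IS NULL"
        ),
        # Keys that exist in the migrated table but not in the legacy one.
        "only_in_migrated": count(
            f"SELECT COUNT(*) FROM {migrated} m LEFT JOIN {legacy} l "
            f"ON l.{key} = m.{key} WHERE l.{key} IS NULL"
        ),
        "column_mismatches": {},
    }
    for col in columns:
        # SQLite's IS NOT is a NULL-safe inequality: NULL vs NULL counts
        # as equal, NULL vs a value counts as different.
        report["column_mismatches"][col] = count(
            f"SELECT COUNT(*) FROM {legacy} l JOIN {migrated} m "
            f"ON l.{key} = m.{key} WHERE l.{col} IS NOT m.{col}"
        )
    return report


# Hypothetical sample data: one fare differs, and each side has one extra ride.
samples = {
    "rides_redshift": [(1, 10.0, "done"), (2, 12.5, "done"), (3, None, "cancelled")],
    "rides_hive": [(1, 10.0, "done"), (2, 12.0, "done"), (4, 8.0, "done")],
}
conn = sqlite3.connect(":memory:")
for table, rows in samples.items():
    conn.execute(
        f"CREATE TABLE {table} (ride_id INTEGER PRIMARY KEY, fare REAL, status TEXT)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)

report = diff_tables(conn, "rides_redshift", "rides_hive", "ride_id", ["fare", "status"])
print(report)
```

In a real migration the two tables live in different systems, so a check like this would first need both sides pulled into one place (or compared through aggregates or hashes), and it would have to be rerun on a schedule to show that the outputs stay the same over time.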
Some tables took us four months each to convert. That’s crazy. Some of these were complex foundational tables, like those describing rides, which combined tons of information. But still, we had senior data engineers, top-level data engineers, working on moving a single table for three to six months. It’s just crazy how hard it was.
The primary bottleneck wasn’t just the re-architecture; it was the iteration speed on Hive being so slow.
Datafold: Why was iteration speed such an issue for a migration?
Gleb: If your migration takes too long, you risk the platform you’re moving to becoming legacy before you even finish. It’s a double disaster. We should not have migrated to Hive. It was the absolute worst thing to migrate to.
It wasn’t so much about the migration process itself but the choice of the target platform. By the time Lyft decided to move to Hive, it was already at the peak of its life cycle. In 2016, Hive was widely adopted, but companies were already moving on to better technologies. Lyft started its migration then, but by the time we finished in 2020, Hive was deep, deep legacy. Nobody was trying to move to Hive anymore—everyone was trying to get off it. And we had just spent four years migrating to this outdated technology.
That terrible iteration speed made the entire process painfully slow. Between the debugging, the manual diff checks, and the platform inefficiencies, everything about the migration was unnecessarily hard.
Migrations as career killers?
What if data migrations could be an opportunity rather than a liability? Gleb envisions a future where automation takes over the tedious, error-prone aspects of migration, freeing engineers to focus on impactful, innovative work. He explains how tools like Datafold are transforming migrations from dreaded career risks into projects that unlock new opportunities for data teams and businesses alike.
Datafold: You mentioned once that data migrations are career killers. That phrase remains stuck in my head. Why did you say that? And do you still believe that today?
Gleb: Yeah, I do still believe it. The problem is that, in a good organization, your career success depends on the impact you make and how visible that impact is to the people who matter in the organization.
If, as a data engineer, you ship pipelines that unblock valuable use cases and power things the company depends on, you’ll get noticed. Your work will get recognized. You can say, "I contributed to this OKR to reduce fraud from 3% to 1%. To do this, I built a very complex pipeline, and it worked." It’s a lot about building new things.
But migration is essentially dealing with technical debt, and that’s usually far less glamorous and recognizable. If I’m a VP of Product and I want to build new things, I don’t really care if the queries run on Snowflake or SQL Server—I just want solutions. Yes, I understand that one architecture is better, but am I happy that my data engineers are spending 60% of their time on a project that doesn’t directly move the needle for the business? Probably not.
And these projects take so much time. Imagine you join a company thinking you’ll work on all these amazing things, and then you spend most of your time on a migration that no one really cares about. It’s very critical work, but people at all-hands meetings are talking about their AI, ML, and cool things they’re building, while you’re just converting SQL code and doing QA. That’s where I was coming from when I called it a career killer.
Also, in my experience, migrations often blow up. They go over budget, they miss deadlines, and they get labeled as failures. When I worked on a migration, I’d sit in meetings with executives, and the vibes were never good. They’d be like, "You’ve been working on moving five tables for four months, and it’s still not done? What’s going on?" It’s hard, you fail, and everyone’s unhappy with you.
Imagine you’re a data engineer and you join a company, only to spend two years on a migration that everyone sees as a failure. That’s a very serious risk for your career. It’s not terrible, but it can definitely slow you down and have a bad impact on your trajectory, even though you’re working very hard into the night to try to move this forward.
Datafold: Do you still think migrations are a career risk today, given the new generation of tools and the lessons learned from past projects?
Gleb: My vision for Datafold is to automate all the work that doesn’t deserve a data engineer’s time. I don’t think doing a data platform migration should be their job. Of course, there are things they’ll still need to do once the data is converted—like deploying it, communicating with stakeholders, and figuring out how to get it to production—but that’s maybe 1% of all the manual effort. The other 99% can be automated.
There’s absolutely no reason why humans should be doing this by hand. Imagine you’re running a hotel and having your staff wash sheets by hand. Why? It doesn’t make your hotel better. Customers don’t care. All it does is make your team miserable and hurt your margins. The same logic applies here.
Having the right tools can actually help and elevate a data team, because not only are they not wasting time on this project, but they can also get on the infrastructure they need much faster. If you’re on SQL Server and you try to move to Snowflake, there’s a big reason why you’re doing this. It’s probably because you can’t ship AI, ML, or BI products on SQL Server. So the faster you get to Snowflake, the faster your team can unlock the capabilities to actually ship very important data products to the business.
Without automation, that migration might take a year. But with automation, it could take just four or five weeks. Imagine what you could do with that saved time. It’s basically giving you a year’s worth of building. It’s a huge career booster. It’s not just about removing the career risk; we could actually give teams a boost, because now they can get to building really impactful, exciting, cutting-edge things a month from now instead of two years from now.
Migrations are a huge chunk of the work that we’re automating, and companies are paying us a lot of money for that. But it’s not the only opportunity for automation. There’s so much more we can optimize—like how do you optimize your data, improve its quality, and so on. We’re just getting started. More things are coming that will make our users’ lives better and help our customers achieve way better outcomes with their data. I’ll leave you with that.