August 1, 2023
Modern Data Stack

Buy not build

Put a comma in the right place

Gleb Mezhanskiy

Which comma do you erase on the whiteboard above?

This dilemma is very common in engineering, and it is particularly hard to resolve when choosing data infrastructure.

One of the costliest mistakes a team can make is attempting to build in-house something that can be bought off the shelf: databases, BI tools, data integration connectors, you name it. If you are tempted (or pushed by your Engineering partners) to do so, ask yourself:

How confident are you that you can do better than a vendor with a strong product vision and a solid technical team that has been working heads-down on the problem for 3-7 years?

Are you willing to wait X months until you have a team in place to build it and the solution it produces is reliable enough?

Would the internal team create more value for your business by re-creating something your competitors buy off the shelf, or by working on your actual customer-facing product?

A vendor quote often triggers sticker shock: “Paying $50K for a tool to pump data between databases?!” The problem is that the full cost of building in-house is not intuitive to add up:

  • Cost of labor, including hiring, management & other overhead (multiply the salary by 2.7)
  • Opportunity cost of time & resources – the cost of not building something else your business needs
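
To make the labor-cost bullet concrete, here is a rough back-of-the-envelope sketch. Only the 2.7 multiplier and the $50K vendor price come from the text above; the salary, team size, and timeline are hypothetical inputs chosen for illustration, and opportunity cost is left out because it depends entirely on your business.

```python
# Overhead multiplier for fully loaded labor cost, taken from the article.
LOADED_COST_MULTIPLIER = 2.7


def build_cost(salary: float, engineers: float, months: float) -> float:
    """Fully loaded labor cost of building in-house (opportunity cost excluded)."""
    return salary * LOADED_COST_MULTIPLIER * engineers * (months / 12)


# Hypothetical inputs, for illustration only.
salary = 150_000       # assumed annual base salary per engineer
engineers = 2          # assumed team size
months = 6             # assumed time to a reliable first version
vendor_price = 50_000  # the $50K figure quoted in the article

in_house = build_cost(salary, engineers, months)
print(f"In-house labor cost: ${in_house:,.0f}")
print(f"Vendor price:        ${vendor_price:,.0f}")
print(f"Difference:          ${in_house - vendor_price:,.0f}")
```

Under these assumed inputs, labor alone comes to several times the vendor price, before ongoing maintenance or on-call time is counted. Swap in your own numbers; the point is the shape of the comparison, not the specific figures.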

The pressure from Engineering to build data infrastructure can be immense (it’s interesting, it’s a hard problem, you can open-source it, etc.). My experience says: buy what can be purchased, and get your engineers excited about higher-value work such as building user-facing features and automating decision making with ML, the things that really differentiate your business.

Counterintuitively, “throwing money at the problem” by buying tools and infrastructure from a vendor is almost always cheaper and faster, and often more effective, than building in-house.

As Nelson Auner neatly puts it in “Building Analytical stack in 2020”, in the context of buying vs. building data integration solutions:

From the engineering side, you may get “Don’t waste money - we could do this ourselves, it is so easy”. Be prepared to ask if any of your well-intentioned teammates:
  1. Have actually implemented and maintained a data pipeline for several years
  2. Are personally volunteering to do so, for you [the internal customer], in a timely manner
  3. Are excited to be on-call 24/7 to fix issues

That is not to say you can’t shoot yourself in the foot by picking a bad vendor: there are plenty of ineffective and expensive solutions with aggressive sales teams out there, so choose wisely.

And stay tuned for the upcoming posts about frameworks for choosing data infrastructure vendors!

