Folding Data #8
When your grandma asks you if she should invest in crypto, you know the Bitcoin bubble is out of control. Likewise, when the UK government launches a hub on data quality practices packed with case studies, educational courses, and events, it’s time for us all to acknowledge that poor data quality is quickly becoming the #1 bottleneck for data enablement across the entire economy. It's no longer something that only the enormous FAANG data teams should worry about. If you ask me when the right time is to start thinking about implementing data quality practices, I’d say as soon as the words “SQL”, “data warehouse” and “analytics” enter your discussions at work.
The Modern Analytics Stack - Open-source Edition
I’m updating my guide to the modern data stack for analytics, which highlights the best products for your stack. But I know that open-source options might be a better fit for your organization, depending on your goals and priorities. So here’s a guide if you’re planning to build your ideal data stack using only open-source technologies.
📚Show me the open-source stack 👀
An Interesting Read
Small woodland creatures make complicated concepts more manageable in this stunning visualization. If you are fighting your leadership over upgrading your data ingestion to a proper pub/sub architecture, just send them this.
🐀 Follow this gentle intro to Apache Kafka 🐿️
August Data Quality Meetup Speaker Lineup is Finalized
At our meetups, we run a series of quick 7-minute lightning talks by data leaders from top companies, followed by a panel discussion. We’re thrilled to announce the speaker lineup, covering the following topics:
- Streaming ML Pipelines with Apache Beam and Google Dataflow
- Leading a thriving millennial analytics organization
- What activity schema is and how it helps with data quality
- Data Drift & Early Monitoring for Machine Learning Models
If you haven’t RSVPed yet, be sure to sign up now 👇
Save my spot on August 26th 🗓️
Before You Go
Are you following Datafold on LinkedIn? We try to share interesting things there that might not make the cut for the newsletter each week.
After our “monsters in the data lake” joke in the last newsletter, this one feels like a suitable follow-up. Do you drop tables or turn them into a secret hoard?