Folding data #23
An Interesting Read: Thinking of Analytics Tools as Products
PM: “Can you get this data for me into a dashboard?”
Me, knowing this would be a non-straightforward, manual task: “Ok! What will you use it for?”
PM: “Oh, nothing in particular, I was just curious!”
Me: [sigh.]
Building on her experience as an early member of the Fitbit Data Team, Emily Thompson shares her approach to treating data as a product in its own right. While the concept isn’t particularly new, she walks through some important steps and findings that show just how muddled analytics can get in an organization.
Treating data as the product
Tool of the Week: dbtvault
I spent two years of my life migrating Lyft's data from Redshift to a Trino (Presto)-powered data lake, and if there is one thing I wish I had known before embarking on such a project, it is that waterfall dimensional modeling (the classical Kimballian approach) simply does not work at the scale, complexity, and evolution speed of modern data platforms. The Data Vault emerged as a solution to this problem: how can you build a well-structured, performant, and auditable data model in a fast-moving organization? Originally proposed by Dan Linstedt and popularized by Kent Graziano of Snowflake, the pattern is gaining momentum in the dbt community. dbtvault is a dbt package that makes implementing Data Vault easier by automating the generation of boilerplate SQL.
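To make "boilerplate SQL" a bit more concrete, here is a minimal Python sketch of the kind of repetitive query a Data Vault hub needs: one row per distinct business key, carrying a hash key, a load date, and a record source. This is not dbtvault's API or its generated output. The function `render_hub_sql` and every table and column name in it are hypothetical, chosen only to illustrate why templating this pattern saves typing.

```python
def render_hub_sql(source_table, hash_key, business_key,
                   load_date="load_date", record_source="record_source"):
    """Render a SELECT that keeps the earliest row per hash key (a hub-style query).

    All names are illustrative assumptions, not part of dbtvault.
    """
    columns = ", ".join([hash_key, business_key, load_date, record_source])
    lines = [
        f"SELECT {columns}",
        "FROM (",
        f"    SELECT {columns},",
        # Rank rows per hash key so only the first-seen record survives.
        f"           ROW_NUMBER() OVER (PARTITION BY {hash_key} ORDER BY {load_date}) AS rn",
        f"    FROM {source_table}",
        ") AS ranked",
        "WHERE rn = 1",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical staging table and column names.
    print(render_hub_sql("stg_customers", "customer_hk", "customer_id"))
```

Every hub in a model repeats this shape with different table and key names, which is why generating it from a handful of parameters is appealing.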
Check out dbtvault on GitHub
Share your 2021 wins
The data is clear: 2021 is almost over. In fact, we will be sending you our last newsletter of the year next week so that your inbox doesn’t continue to fill up while you shift priorities at the end of the year. For our last newsletter, I’d love to hear from you: what were some of your biggest wins of 2021? Whether it’s work- and data-related, or you successfully managed to move house without losing a cat, I want to hear it and potentially share it with the community next week. Reply to this email and let me know!
Before You Go