You’ve got 99 problems but data shouldn’t be one

Episode Link

https://share.snipd.com/episode/261960ba-6968-4cd8-8989-d578d59506d5

Episode publish date

June 27, 2025 4:20 AM (UTC)

Last edit date

Jul 2, 2025 4:31 PM

Last snip date

July 1, 2025 4:16 PM (GMT+1)

Last sync date

July 1, 2025 4:16 PM (GMT+1)

Show

The Stack Overflow Podcast

Show notes link

https://stackoverflow.blog/podcast/

Snips

Episode show notes

Your snips

[00:47] Toby Mao's Career Journey

[01:53] Iaroslav Zeigerman's Early Path

[02:57] Netflix's Data Tooling Gap

[05:04] Use SQL to Lower Barriers

[06:30] Mock SQL Engines for Unit Tests

[08:16] Auto-generate Unit Tests From Data

[10:17] SQL Evolves Divisively

[15:01] Adopt SQLMesh for Pipelines

[18:06] Challenges Managing Playback Data

[20:30] Separate Analytics From Production

[23:22] AI Increases Data Pipeline Stakes

[25:14] Leverage SQLMesh for Dev Speed

𝗧𝗼𝗱𝗮𝘆'𝘀 𝗣𝗼𝗱𝗰𝗮𝘀𝘁 𝗜𝗻𝘀𝗶𝗴𝗵𝘁 𝗥𝗲𝗰𝗮𝗽🎙️

Just wrapped up this episode:

𝗣𝗼𝗱𝗰𝗮𝘀𝘁: The Stack Overflow Podcast with Ryan Donovan, Toby Mao, and Iaroslav Zeigerman

𝗘𝗽𝗶𝘀𝗼𝗱𝗲: You've got 99 problems but data shouldn't be one

𝗗𝗮𝘁𝗲: July 2, 2025

𝗞𝗲𝘆 𝗧𝗮𝗸𝗲𝗮𝘄𝗮𝘆

Separate your analytics from production databases as you scale. Small datasets (under 100 GB) might work in a single system, but once you hit terabytes of data you need a specialized data warehousing solution with clear separation between engineering and analytics functions.

𝗪𝗵𝘆 𝗜𝘁 𝗠𝗮𝘁𝘁𝗲𝗿𝘀

As organizations grow, the skills and priorities of engineers (who optimize for performance and data integrity) diverge from those of analysts (who focus on business insights). This separation becomes critical for maintaining system performance, preventing production database locks, and enabling each team to work efficiently with tools optimized for their specific needs.
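To make the separation concrete for myself, here is a minimal sketch in Python, not anything shown in the episode. It assumes two in-memory SQLite databases standing in for the production system and the warehouse; the `orders` and `orders_snapshot` tables and both helper functions are hypothetical names I made up for illustration. The point is only that analysts query the copy, never the production connection.

```python
import sqlite3


def extract_to_warehouse(prod, warehouse):
    """Copy all rows from the production table into the analytics copy in one batch."""
    rows = prod.execute("SELECT id, customer, amount FROM orders").fetchall()
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS orders_snapshot "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    # INSERT OR REPLACE keeps the load idempotent if the batch is re-run.
    warehouse.executemany(
        "INSERT OR REPLACE INTO orders_snapshot VALUES (?, ?, ?)", rows
    )
    warehouse.commit()
    return len(rows)


def revenue_by_customer(warehouse):
    """Analytic query that only ever touches the warehouse connection."""
    cursor = warehouse.execute(
        "SELECT customer, SUM(amount) FROM orders_snapshot GROUP BY customer"
    )
    return dict(cursor.fetchall())


# Hypothetical production database (stand-in for the transactional system).
prod = sqlite3.connect(":memory:")
prod.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
prod.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 10.0), (2, "acme", 5.0), (3, "globex", 7.5)],
)
prod.commit()

# Separate analytics database (stand-in for a data warehouse).
warehouse = sqlite3.connect(":memory:")
copied = extract_to_warehouse(prod, warehouse)
report = revenue_by_customer(warehouse)
print(copied, report)
```

In a real setup the batch copy would be a scheduled pipeline and the warehouse a dedicated system, but the shape is the same: production only serves one short read per batch, and every heavy aggregation runs somewhere else.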

𝗥𝗲𝗳𝗹𝗲𝗰𝘁𝗶𝗼𝗻 🧠

This reminded me how often I've seen teams try to shortcut proper data architecture only to create technical debt later. With AI now consuming more of our data pipelines, the stakes are even higher – garbage in truly means 10x garbage out with LLMs. Taking the time to build clean, separate data systems pays dividends when scaling becomes inevitable.

To get the full insight, check out the episode!

#DataScience #Finance #MachineLearning #AI #CareerGrowth #TechLeadership #DecisionMaking #StackOverflowPodcast #99ProblemsButDataShouldntBeOne