You've got 99 problems but data shouldn't be one

Episode publish date
June 27, 2025 4:20 AM (UTC)
Last edit date
Jul 2, 2025 4:31 PM
Last snip date
July 1, 2025 4:16 PM (GMT+1)
Last sync date
July 1, 2025 4:16 PM (GMT+1)
Show

The Stack Overflow Podcast

Snips
12

Episode show notes

Your snips

[00:47] Toby Mao's Career Journey
[01:53] Iaroslav Zeigerman's Early Path
[02:57] Netflix's Data Tooling Gap
[05:04] Use SQL to Lower Barriers
[06:30] Mock SQL Engines for Unit Tests
[08:16] Auto-generate Unit Tests From Data
[10:17] SQL Evolves Divisively
[15:01] Adopt SQLMesh for Pipelines
[18:06] Challenges Managing Playback Data
[20:30] Separate Analytics From Production
[23:22] AI Increases Data Pipeline Stakes
[25:14] Leverage SQLMesh for Dev Speed

๐—ง๐—ผ๐—ฑ๐—ฎ๐˜†'๐˜€ ๐—ฃ๐—ผ๐—ฑ๐—ฐ๐—ฎ๐˜€๐˜ ๐—œ๐—ป๐˜€๐—ถ๐—ด๐—ต๐˜ ๐—ฅ๐—ฒ๐—ฐ๐—ฎ๐—ฝ๐ŸŽ™๏ธ

Just wrapped up this episode:
Podcast: The Stack Overflow Podcast with Ryan Donovan, Toby Mao, and Iaroslav Zeigerman
Episode: You've got 99 problems but data shouldn't be one
Date: July 2, 2025

๐—ž๐—ฒ๐˜† ๐—ง๐—ฎ๐—ธ๐—ฒ๐—ฎ๐˜„๐—ฎ๐˜†

Separate your analytics from production databases as you scale. While small datasets (under 100GB) might work in a single system, once you hit terabytes of data, you need specialized data warehousing solutions with clear separation between engineering and analytics functions.
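As a rough illustration of that separation, here is a minimal sketch that snapshots a table from a "production" database into a separate "analytics" database and runs the reporting query only against the copy. It is not how the guests or SQLMesh do it: the `orders` schema, the function names, and the use of in-memory SQLite as a stand-in for both systems are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema shared by both databases in this sketch.
ORDERS_DDL = "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"


def make_production_db() -> sqlite3.Connection:
    """Stand-in for the production database that serves the application."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ORDERS_DDL)
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (?, ?)",
        [("ada", 10.0), ("ada", 5.0), ("bob", 7.5)],
    )
    conn.commit()
    return conn


def snapshot_to_analytics(prod: sqlite3.Connection) -> sqlite3.Connection:
    """Copy the orders table into a separate analytics database."""
    analytics = sqlite3.connect(":memory:")
    analytics.execute(ORDERS_DDL)
    rows = prod.execute("SELECT id, customer, amount FROM orders").fetchall()
    analytics.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    analytics.commit()
    return analytics


def revenue_by_customer(analytics: sqlite3.Connection) -> dict:
    """Analyst-style aggregate, run against the copy rather than production."""
    cur = analytics.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    )
    return dict(cur.fetchall())


prod = make_production_db()
analytics = snapshot_to_analytics(prod)
report = revenue_by_customer(analytics)
print(report)
```

In a real setup the copy step would be a scheduled export or replication job into a data warehouse, but the shape is the same: analysts query the copy, so a heavy aggregate never competes with application traffic on the production system.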

๐—ช๐—ต๐˜† ๐—œ๐˜ ๐— ๐—ฎ๐˜๐˜๐—ฒ๐—ฟ๐˜€

As organizations grow, the skills and priorities of engineers (who optimize for performance and data integrity) diverge from those of analysts (who focus on business insights). This separation becomes critical for maintaining system performance, preventing production database locks, and enabling each team to work efficiently with tools optimized for their specific needs.

๐—ฅ๐—ฒ๐—ณ๐—น๐—ฒ๐—ฐ๐˜๐—ถ๐—ผ๐—ป ๐Ÿง 

This reminded me how often I've seen teams try to shortcut proper data architecture only to create technical debt later. With AI now consuming more of our data pipelines, the stakes are even higher: garbage in truly means 10x garbage out with LLMs. Taking the time to build clean, separate data systems pays dividends when scaling becomes inevitable.

To get the full insight, check out the episode!

#DataScience #Finance #MachineLearning #AI #CareerGrowth #TechLeadership #DecisionMaking #StackOverflowPodcast #99ProblemsButDataShouldntBeOne