Archive.

001 · Jun 19, 2026

9 min

The Three Joins AQE Can't Save You From

When and how to override Spark's automatic join strategy — a technical deep-dive for senior engineers.

002 · Jun 19, 2026

4 min

A Decision Guide to Databricks Compute: Matching Your Workload to the Right Cluster

I match compute type to workload pattern — the biggest cost mistake I see is teams using all-purpose clusters for production jobs because that's what they developed on.

003 · Jun 19, 2026

12 min

Databricks "Best Practices" That Are Actually Outdated in 2026

Stop following 2023 advice in 2026. Half of what's on Stack Overflow and old blog posts is fighting the platform instead of using it.

004 · Jun 02, 2026

11 min

I Built a 100-Match Soccer Analytics Dashboard in a Weekend. No Build Tools.

No npm. No Vite. No bundler. Just HTML, vanilla JavaScript, D3, and free StatsBomb data. Here's how, and what I picked up about xG, xT, and 2010-era web tech.

005 · Jun 01, 2026

9 min

Spark Just Got Real: Benchmarking the New Real-Time Mode Against Micro-Batch

A benchmark of the new feature in Spark 4.1 that completely changes how Spark deals with streaming data.
