The Databricks Data Engineer

Podcast by Jakub Lasak

English

Technology & science

About The Databricks Data Engineer

Helping 18k+ Databricks data engineers become seniors: interview like seniors, execute like seniors, think like seniors.

All episodes

8 episodes

Liquid Clustering vs Z-Ordering: 4 questions that decide

You open your Databricks workspace. Two Delta tables. Same size, same downstream BI workload. Table A was partitioned and z-ordered in 2023, runs fine. Table B is greenfield this quarter, liquid clustering by default. Your tech lead asks how aggressive you want to be with migration tickets. Whatever you type back is probably wrong.

This is not a feature swap. It's a paradigm shift, and the migration math only makes sense once you can name what actually moved underneath you. Migrate-everything is wrong. Migrate-nothing is wrong. The right answer is per-table, with named criteria.

In this episode:
- What actually changed when liquid clustering shipped, and the one phrase that simplifies every migration debate you'll have for the next two years
- The four-question filter to run table by table, in order, before you commit to a layout decision
- The surviving cases where the old paradigm still wins, including the one the evangelism crowd never names
- Why liquid clustering and partitioning on a Delta table are mutually exclusive, and the operational property you give up if you migrate the wrong tables
- The named audit that turns six hundred legacy tables into three buckets in an afternoon
- What kind of senior engineer your tech lead remembers when the promotion conversation happens

This episode is for Databricks data engineers staring at a migration backlog, defending a greenfield default, or trying to explain to a platform team why some tables shouldn't be touched. Whether you're a mid-level engineer running your first migration, or a senior engineer setting the standard for the next two years of greenfield Delta tables, you'll walk away with a defended per-table answer and the vocabulary to back it up.

---
Helping 18,000+ Databricks data engineers become seniors: interview like seniors, execute like seniors, think like seniors. Follow The Databricks Data Engineer for new episodes every Monday, Wednesday, and Friday.
LinkedIn: linkedin.com/in/jrlasak
Newsletter: dataengineer.wiki
#DataEngineering #Databricks #DataEngineer #CareerGrowth #ApacheSpark #DeltaLake

18 May 2026 - 18 min

The compounding curve: why some Databricks engineers' salaries grow 5x faster than others

Year one. Two new juniors join the same Databricks platform org. Same starting salary, same skills, same desk. Year three, five thousand bucks apart. Year eight, household-car-and-a-half apart. Every year. Forever. Both worked hard. Both stayed technical. Both got positive reviews. Neither did anything wrong. So what happened?

Salary in this field isn't one curve. It's two that look identical for the first three years, then peel apart. The choice between them gets made on a handful of small Tuesdays most engineers don't even remember.

In this episode:
- Why skill is the floor and leverage is the ceiling, and why the better technician is often the worse-paid engineer
- The four small Tuesday choices that decide which curve a Databricks data engineer walks up
- The difference between expanding what you ship and expanding what you own, and why your manager only fights for one of them
- How a junior with twelve hours of writing across four years out-leveraged engineers with twice her tenure
- The compass question to run on every career fork before the curve runs you

This episode is for Databricks data engineers who suspect their salary trajectory isn't matching their effort, and who want to know what the highest-paid engineers on their team are doing differently. Whether you're a mid-level wondering why peers at the same level make fifty grand more, or a senior trying to understand why your raises keep shrinking, you'll walk away with a four-part audit you can run on your last six months and your next decision.

---
Helping 18,000+ Databricks data engineers become seniors: interview like seniors, execute like seniors, think like seniors. Follow The Databricks Data Engineer for new episodes every Monday, Wednesday, and Friday.
LinkedIn: linkedin.com/in/jakublasak
Newsletter: dataengineer.wiki
#DataEngineering #Databricks #DataEngineer #CareerGrowth #ApacheSpark #DeltaLake

11 May 2026 - 22 min

The 90/9/1 rule of Databricks performance work - how to triage Spark optimization in 60 seconds

Your team is three weeks into a Databricks performance push. Broadcast hints in PRs. AQE flags toggled like Christmas lights. Partition counts re-tuned for the third time. The manager is asking, gently, when the gains are showing up in the bill. The staff DE on the next team finished theirs in two afternoons. Same workloads, bigger drop. They were running a triage you have never been taught.

In this episode:
- Why most of what your team calls Spark optimization is cosmetic and will never move the bill, no matter how clean the PR
- The two named tests senior Databricks engineers run on every workload before they touch a config
- Why the same change (caching, salted joins, skew handling) can be cosmetic on one workload and structural on the one next to it
- Where the real leverage in a Spark workload actually lives, and why it is almost always visible from outside the code

This episode is for Databricks data engineers stuck in a performance push that is not converting effort into runtime or bill drops. Whether you are mid-level drowning in config tweaks, or senior watching the bill refuse to move, you will walk away with a one-minute triage you can run on any Spark workload tomorrow morning.

---
Helping 18,000+ Databricks data engineers become seniors: interview like seniors, execute like seniors, think like seniors. Follow The Databricks Data Engineer for new episodes every Monday, Wednesday, and Friday.
LinkedIn: linkedin.com/in/jakublasak
Newsletter: dataengineer.wiki
#DataEngineering #Databricks #DataEngineer #CareerGrowth #ApacheSpark #DeltaLake

4 May 2026 - 17 min

The Databricks data engineer in 2026 - the four shifts that just changed your job

You scroll past the cancelled junior req, the "serverless first" line on your director's planning slide, and the third Lakebase mention from your Databricks rep this quarter. Each one looks like a news item. None of them feel like they're about you. They are.

Four structural shifts have already happened in the field, and the words "Databricks data engineer" don't mean what they meant in 2024. Most engineers haven't named them out loud yet, which is why their next promotion packet is going to read a year out of date.

In this episode:
- Why the junior hiring pipeline didn't pause - it closed, and what that does to mid-level reqs
- How serverless quietly turned cost discipline into the new performance tuning, and why your manager wants it in your promo packet
- Where Unity Catalog fluency crossed from "nice differentiator" to "you get filtered in the screen without it"
- What the data engineering and backend convergence (Lakebase, serving layers, operational reads on the lakehouse) opens up for engineers who move first
- The diagnostic question to ask yourself about the skill you're betting your next two years on

This episode is for Databricks data engineers planning their 2026, whether you're a senior wondering where your value is moving, a mid-level engineer trying to pick the right thing to learn next, or a junior staring at a hiring market that doesn't look like the one you trained for. You'll walk away with a four-part map of the field and a concrete next move for your career segment.

---
Helping 18,000+ Databricks data engineers become seniors: interview like seniors, execute like seniors, think like seniors. Follow The Databricks Data Engineer for new episodes every Monday, Wednesday, and Friday.
LinkedIn: linkedin.com/in/jakublasak
Newsletter: dataengineer.wiki
#DataEngineering #Databricks #DataEngineer #CareerGrowth #ApacheSpark #DeltaLake #UnityCatalog #Lakebase

27 Apr 2026 - 18 min

9 Behaviors Quietly Killing Your Promotion To Senior Databricks Data Engineer

Mid-level is a down escalator. It looks like flat ground. You feel productive, your tickets close on Friday, your burndown chart is healthy, and your review says "reliable executor of well-defined work" for the third cycle in a row. That sentence is the official label for "not getting promoted this year" - and most Databricks data engineers never decode it.

It isn't a skill gap. It's nine habits that each feel like professionalism, compound against you across review cycles, and separate the engineer up for staff next quarter from the engineer still stuck at mid three years from now.

In this episode:
- Why the ambiguous ticket nobody wants is the senior-engineer starter pack, and the clean ticket is the trap
- How silent 2 a.m. pipeline fixes disappear from your promotion packet, and what to post the next morning instead
- The difference between how mid-level and senior Databricks engineers spend a Tuesday afternoon
- Why mid-level excellence is senior mediocrity at the same quality bar
- The one ceiling-breaker behavior that predicts mid-to-senior promotion more reliably than tenure, tech depth, or luck

This episode is for Databricks data engineers who ship solid work, get "solid performer" reviews, and can't name why they're still at mid. Whether you're one cycle in and want to avoid the trap, or three cycles in and wondering what went wrong, you'll walk away with a named taxonomy to audit your last six months against and one concrete move to run this Thursday afternoon.

---
Helping 18,000+ Databricks data engineers become seniors: interview like seniors, execute like seniors, think like seniors. Follow The Databricks Data Engineer for new episodes every Monday, Wednesday, and Friday.
LinkedIn: linkedin.com/in/jakublasak
Newsletter: dataengineer.wiki
#DataEngineering #Databricks #DataEngineer #CareerGrowth #ApacheSpark #DeltaLake

20 Apr 2026 - 14 min
