dbt PR Review — real-world issue corpus

A living, sourced corpus of real dbt/SQL pitfalls practitioners report (dbt Discourse/Slack, Reddit, StackOverflow, engineering blogs), each turned into a reproducible test. Every entry is a problem someone actually hit — not a synthetic example — and the reviewer is held to catching it with zero false positives on the correct version.

Reproduce: bun run --conditions=browser script/review-realworld-eval.ts (detection + precision) and script/review-dialect-coverage.ts (10-dialect coverage). Verified end-to-end on a real third-party repo (jaffle_shop, DuckDB).

Latest measured results

Caught: 15/15 sourced pitfalls
False positives on the correct version: 0/5
Dialect coverage (10 dialects): function precision 100%, recall 95.7%, structural/type 100%
Real repo (jaffle_shop PR): every finding tied to the change; no noise about pre-existing code (diff-scoped)

The corpus

#	Pitfall	Where practitioners report it	What the reviewer emits
1	LEFT JOIN silently becomes INNER — a filter on the right table in `WHERE` drops the unmatched rows	sqlbenjamin.wordpress.com "LEFT JOINs and WHERE clauses", SQLShack, Toad forum	`join_risk` (critical) — move the predicate to the `ON` clause
2	Fan-out — one-to-many join inflates `SUM`/`COUNT`	dbt join-logic docs, Holistics fan-out docs	`fanout` / `join_risk` — aggregate before joining, or `count(distinct …)`
3	`NOT IN (subquery)` with NULLs returns no rows	classic SQL gotcha (StackOverflow)	`sql_correctness` — use `NOT EXISTS`
4	Incremental model with no `is_incremental()` guard → full reprocess / dupes	dbt incremental-models docs	`materialization`
5	Dedup `row_number()` with no `ORDER BY` → which row survives flaps between runs	dbt "remove partial duplicates" blog	`dedup` (warning)
6	Clock (`current_timestamp`/`getdate`) baked into a transform → non-idempotent	dbt Slack / idempotency guidance	`idempotency`
7	*`SELECT ` in a mart** → breaks downstream on upstream schema change; scan cost	dbt style guides	`warehouse_cost` / `sql_quality`
8	`= NULL` instead of `IS NULL` → always false	classic SQL gotcha	`sql_correctness`
9	Division with no zero-guard → divide-by-zero failures	dbt Slack (`safe_divide`/`nullif`)	`sql_correctness` (core L032)
10	Non-portable function for the project's dialect (e.g. `NVL` on BigQuery)	cross-warehouse migration pain (dbt Discourse, SQLGlot)	`sql_quality` (core L033, dialect-aware)
11	Comma / implicit cross join	SQL joins tutorials, cartesian-product warnings	`join_risk`
12	Unguarded `COUNT(DISTINCT)` at scale → cost	BigQuery/Snowflake cost threads (`approx_count_distinct`)	`warehouse_cost`
13	`BETWEEN` on a timestamp drops the last day's afternoon (inclusive upper bound → `00:00:00`)	SO "Exclude rows with certain time of day"	`sql_correctness` — use half-open `>= / <`
14	String `\\|\\|` concat NULL-propagation — any NULL operand → whole result NULL	Baeldung "Concatenate with NULL Values in SQL"	`sql_correctness` — use `concat_ws`/`coalesce`
15	Hand-rolled surrogate key over raw concat — NULL field nulls the key / NULL-vs-`''` collisions	dbt-utils #488, dbt Discourse surrogate-key	`sql_correctness` / `dedup`

Plus the incremental unique_key with NULL components → duplicate rows class (dbt Discourse #17298, dbt-core #7597) is covered by the incremental + surrogate-key detectors.

Precision — the correct versions stay silent

The corpus pairs each pitfall with its fix and asserts no finding: a LEFT JOIN anti-join (WHERE right.key IS NULL), nullif-guarded division, NOT EXISTS, row_number() with an ORDER BY, and a dialect's native function (e.g. NVL on Snowflake). Several real false positives were found and removed via this corpus (blanket SAFE_CAST nag, ARRAY_AGG-without-WITHIN GROUP on the wrong dialect, ORDER BY DESC without NULLS LAST, and OLTP/index rules like correlated-subquery / function-on-filter that don't apply to columnar warehouses).

The bar: a first-time user opening a PR should see findings that map exactly to what they changed, are correct for their warehouse's dialect, and never nag about code they didn't touch.