Skip to content

Lineage Tools

lineage_check

Trace column-level lineage through SQL transformations. Returns source-to-target mappings for every column.

> lineage_check "
    SELECT
        o.order_id,
        c.name AS customer_name,
        o.amount * t.rate AS amount_usd,
        DATE_TRUNC('month', o.order_date) AS order_month
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    JOIN fx_rates t ON o.currency = t.currency AND o.order_date = t.rate_date
  "

Column Lineage:
┌─────────────────┬────────────────────────┬───────────────┐
│ Target Column   │ Source                 │ Transform     │
├─────────────────┼────────────────────────┼───────────────┤
│ order_id        │ orders.order_id        │ direct        │
│ customer_name   │ customers.name         │ alias         │
│ amount_usd      │ orders.amount          │ arithmetic    │
│                 │ fx_rates.rate          │ arithmetic    │
│ order_month     │ orders.order_date      │ DATE_TRUNC    │
└─────────────────┴────────────────────────┴───────────────┘

Confidence: high
Confidence factors: schema-qualified tables, no SELECT *, small graph (7 nodes)

Confidence signals

Lineage confidence is affected by 4 factors:

Factor Impact Why
SELECT * Reduces to medium Cannot determine which columns flow through
Jinja templates Reduces to medium Dynamic SQL not fully parseable
Missing schema qualification Reduces to low Ambiguous table references
Large graph (50+ nodes) Reduces to medium Complex lineage may miss indirect paths

Use cases

Impact analysis before schema changes:

> lineage_check "SELECT * FROM stg_orders"

Warning: SELECT * detected — lineage confidence reduced to medium.
All columns from stg_orders flow through.

Recommendation: Use explicit column list for precise lineage.

Verify lineage preservation during migration:

> lineage_check [original Snowflake SQL]
> lineage_check [translated BigQuery SQL]

Compare: ✓ Same 4 source columns map to same 4 target columns
Lineage preserved across translation.

dbt model dependency tracking:

> dbt_manifest ./target/manifest.json

Models: 47
Sources: 12

> lineage_check [fct_revenue.sql]

Source columns:
  stg_orders.order_amount → fct_revenue.revenue
  stg_orders.order_date → fct_revenue.revenue_date
  dim_products.category → fct_revenue.product_category

Downstream impact: dim_products.category rename would break fct_revenue