This post is part of my “Jinja in dbt” series. If you’re new to Jinja templating, I recommend starting with the introduction to Jinja Templating Engine in dbt to understand the fundamentals before diving into conditional statements.

Conditional logic is key to creating flexible, environment-aware dbt models. Jinja’s if statements let you generate different SQL depending on variables, target environments, or other compile-time conditions, so a single model can adapt to different environments or schema changes without being rewritten.

This article demonstrates how to use Jinja’s conditional statements to build dynamic dbt models, with examples covering environment-based logic, data quality checks, multi-source field mapping, conditional column selection, conditional aggregations, and conditional joins.

Basic if statement syntax

Jinja’s conditional statements use the {% if %}, {% elif %}, and {% else %} tags, and every if block must be closed with an {% endif %} tag. This syntax allows you to generate SQL code that varies depending on compile-time conditions.

For example, your dbt model can include the following conditional statements:


select
    customer_id,
    order_date,
    amount,
    {% if var('include_tax', false) %}
    amount * 1.08 as amount_with_tax,
    {% endif %}

    {% if var('environment') == 'production' %}
    'PROD' as environment_flag
    {% else %}
    'DEV' as environment_flag
    {% endif %}
from {{ ref('orders') }}

When include_tax is set to true and environment is set to ‘production’, this compiles to:

select
    customer_id,
    order_date,
    amount,
    amount * 1.08 as amount_with_tax,
    'PROD' as environment_flag
from analytics.dbt_marts.orders

When include_tax is false and environment is ‘dev’, it compiles to:

select
    customer_id,
    order_date,
    amount,
    'DEV' as environment_flag
from analytics.dbt_marts.orders
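
Both variables can be supplied at compile time with the --vars flag. A minimal example, where the model name is a placeholder for wherever this SQL lives in your project:

# Run with tax included and the production flag set
# ("stg_orders_enriched" is a hypothetical model name)
dbt run --select stg_orders_enriched --vars '{"include_tax": true, "environment": "production"}'

Note that var('include_tax', false) falls back to its second argument when the variable is not passed, whereas var('environment') has no default and will raise a compilation error if it is missing.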

Environment-based conditional logic

One of the most common use cases for if statements is adapting a model’s behaviour based on the target environment where it is executed. This allows you to optimise for development speed while maintaining production data integrity.

In the example below, the if statement is used to limit the volume of data processed in development vs production environments:


select
    order_id,
    customer_id,
    {% if target.name == 'dev' %}
    -- In development, limit data for faster processing
    order_date,
    amount
from {{ ref('orders') }}
where order_date >= current_date - interval '30 days'
    {% else %}
    -- In production, include all historical data
    order_date,
    amount,
    historical_flag
from {{ ref('orders') }}
    {% endif %}

In the development environment, this compiles to:

select
    order_id,
    customer_id,
    -- In development, limit data for faster processing
    order_date,
    amount
from analytics.dbt_marts.orders
where order_date >= current_date - interval '30 days'

In production, it compiles to:

select
    order_id,
    customer_id,
    -- In production, include all historical data
    order_date,
    amount,
    historical_flag
from analytics.dbt_marts.orders
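
The target.name value used above comes from profiles.yml: whichever output is active, either the default target or one chosen with dbt run --target prod, is exposed to Jinja as target. A minimal sketch, assuming a Postgres connection; the profile name and connection details are placeholders:

# profiles.yml (sketch) -- names and connection details are illustrative only
jaffle_shop:
  target: dev          # default target when --target is not passed
  outputs:
    dev:
      type: postgres
      host: localhost
      user: dbt_user
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      dbname: analytics
      schema: dbt_dev
      threads: 4
    prod:
      type: postgres
      host: prod-db.example.com
      user: dbt_user
      password: "{{ env_var('DBT_PASSWORD') }}"
      port: 5432
      dbname: analytics
      schema: dbt_marts
      threads: 8

The output names are exactly what target.name returns, so they must match whatever your if statements check for.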

Data quality conditional checks

You can also use conditional statements to implement different data quality rules across environments. This approach allows you to enforce strict validation in production while being more permissive during development.

For example, you might want to filter out invalid records from your production data (such as missing or malformed email addresses):


select
    customer_id,
    email,
    phone,
    created_date
from {{ ref('raw_customers') }}
where 1=1
    {% if var('strict_validation', false) %}
    and email is not null
    and email like '%@%'
    {% endif %}

    {% if target.name == 'production' %}
    and created_date >= '2020-01-01'
    {% else %}
    and created_date >= current_date - interval '90 days'
    {% endif %}

With strict_validation set to true in production, this compiles to:

select
    customer_id,
    email,
    phone,
    created_date
from analytics.dbt_staging.raw_customers
where 1=1
    and email is not null
    and email like '%@%'
    and created_date >= '2020-01-01'

With strict_validation set to false in development, it compiles to:

select
    customer_id,
    email,
    phone,
    created_date
from analytics.dbt_staging.raw_customers
where 1=1
    and created_date >= current_date - interval '90 days'
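
A sensible project-wide default for strict_validation can live in dbt_project.yml, so local development stays permissive without passing anything, while the production job overrides it explicitly. A small sketch; the model selector in the override command is hypothetical:

# dbt_project.yml (excerpt)
vars:
  strict_validation: false   # permissive by default for local development

# Production job override (run from the command line):
#   dbt run --select stg_customers --vars '{"strict_validation": true}'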

Multi-conditional logic

If you need more sophisticated logic with multiple conditions, use elif. This pattern is particularly useful when dealing with multiple data sources that require different field mappings.

Consider this dbt model, which normalises column names from different data sources:


{% set data_source = var('data_source', 'default') %}

select
    {% if data_source == 'salesforce' %}
    sfdc_id as customer_id,
    account_name as customer_name,
    created_date as registration_date
    {% elif data_source == 'hubspot' %}
    hubspot_id as customer_id,
    company_name as customer_name,
    create_date as registration_date
    {% else %}
    id as customer_id,
    name as customer_name,
    created_at as registration_date
    {% endif %}
from 
    {% if data_source == 'salesforce' %}
    {{ ref('sfdc_accounts') }}
    {% elif data_source == 'hubspot' %}
    {{ ref('hubspot_companies') }}
    {% else %}
    {{ ref('default_customers') }}
    {% endif %}

When data_source is set to ‘salesforce’, this compiles to:

select
    sfdc_id as customer_id,
    account_name as customer_name,
    created_date as registration_date
from analytics.dbt_staging.sfdc_accounts

When data_source is set to ‘hubspot’, it compiles to:

select
    hubspot_id as customer_id,
    company_name as customer_name,
    create_date as registration_date
from analytics.dbt_staging.hubspot_companies

When data_source is set to ‘default’ (or any other value), it compiles to:

select
    id as customer_id,
    name as customer_name,
    created_at as registration_date
from analytics.dbt_staging.default_customers
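
Because the branch is resolved at compile time, you can preview exactly which source a given run would use without touching the warehouse. dbt compile renders the Jinja and writes the resulting SQL under the target/compiled/ directory; the model name below is hypothetical:

# Render the model for a given source without executing it
dbt compile --select customers_unified --vars '{"data_source": "hubspot"}'

# Inspect the rendered SQL under target/compiled/<project_name>/models/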

Select columns based on a condition

If you need to handle PII data or add optional metrics, if statements let you dynamically include or exclude columns based on configuration.

In this example, we define variables that control whether specific columns are included:


{% set include_pii = var('include_pii', false) %}
{% set include_analytics = var('include_analytics', true) %}

select
    order_id,
    {% if include_pii %}
    customer_email,
    shipping_address,
    {% endif %}
    product_id,
    quantity,
    amount,
    {% if include_analytics %}
    -- Analytics columns
    case 
        when amount > 1000 then 'high_value'
        when amount > 100 then 'medium_value'
        else 'low_value'
    end as value_tier,
    extract(hour from order_timestamp) as order_hour,
    {% endif %}
    order_date
from {{ ref('raw_orders') }}

With include_pii set to true and include_analytics set to true, this compiles to:

select
    order_id,
    customer_email,
    shipping_address,
    product_id,
    quantity,
    amount,
    -- Analytics columns
    case 
        when amount > 1000 then 'high_value'
        when amount > 100 then 'medium_value'
        else 'low_value'
    end as value_tier,
    extract(hour from order_timestamp) as order_hour,
    order_date
from analytics.dbt_staging.raw_orders

With both variables set to false, it compiles to:

select
    order_id,
    product_id,
    quantity,
    amount,
    order_date
from analytics.dbt_staging.raw_orders
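
One variation worth considering, which is my suggestion rather than part of the model above, is to derive include_pii from the target as well as the variable, so PII columns can never be switched on in production by accident:

-- Assumption: the production target is named 'production', as in the earlier examples
{% set include_pii = target.name != 'production' and var('include_pii', false) %}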

Apply aggregations conditionally

Jinja’s if statements are a great way to build flexible reporting models with different aggregation logic based on parameters.

For example, you can dynamically aggregate the data based on different time periods:


{% set aggregation_level = var('aggregation_level', 'daily') %}

select
    {% if aggregation_level == 'daily' %}
    date_trunc('day', order_date) as period,
    {% elif aggregation_level == 'weekly' %}
    date_trunc('week', order_date) as period,
    {% elif aggregation_level == 'monthly' %}
    date_trunc('month', order_date) as period,
    {% else %}
    date_trunc('year', order_date) as period,
    {% endif %}
    
    customer_id,
    count(*) as order_count,
    sum(amount) as total_amount,
    avg(amount) as avg_amount
    
from {{ ref('orders') }}
group by 1, 2

When aggregation_level is set to ‘weekly’, this compiles to:

select
    date_trunc('week', order_date) as period,
    
    customer_id,
    count(*) as order_count,
    sum(amount) as total_amount,
    avg(amount) as avg_amount
    
from analytics.dbt_marts.orders
group by 1, 2

When aggregation_level is set to ‘monthly’, it compiles to:

select
    date_trunc('month', order_date) as period,
    
    customer_id,
    count(*) as order_count,
    sum(amount) as total_amount,
    avg(amount) as avg_amount
    
from analytics.dbt_marts.orders
group by 1, 2
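
Because the final else branch silently falls back to yearly aggregation, a typo such as 'montly' would go unnoticed. An optional safeguard, my addition rather than part of the original model, is to validate the variable right after the existing set line and fail the compilation with a clear message:

{% if aggregation_level not in ['daily', 'weekly', 'monthly', 'yearly'] %}
    {# Stop at compile time instead of silently falling back to yearly #}
    {{ exceptions.raise_compiler_error("Unsupported aggregation_level: " ~ aggregation_level) }}
{% endif %}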

Design conditional joins

In this last use case we will look at how to use if statements to modify join logic. This pattern is useful when you need to join different tables or change join conditions based on your environment or data source.

In this dbt model, both the selected fields and the joined table depend on the variables’ values. Note that the comma after o.amount is emitted only when join_customer_data is true, so the select list stays valid SQL when the join is disabled:


{% set join_customer_data = var('join_customer_data', true) %}
{% set customer_data_source = var('customer_data_source', 'crm') %}

select
    o.order_id,
    o.customer_id,
    o.amount{% if join_customer_data %},
        {% if customer_data_source == 'crm' %}
        c.customer_name,
        c.customer_tier
        {% else %}
        c.full_name as customer_name,
        c.segment as customer_tier
        {% endif %}
    {% endif %}
from {{ ref('orders') }} o
{% if join_customer_data %}
    {% if customer_data_source == 'crm' %}
    left join {{ ref('crm_customers') }} c
    {% else %}
    left join {{ ref('analytics_customers') }} c
    {% endif %}
    on o.customer_id = c.customer_id
{% endif %}

With join_customer_data set to true and customer_data_source set to ‘crm’, this compiles to:

select
    o.order_id,
    o.customer_id,
    o.amount,
    c.customer_name,
    c.customer_tier
from analytics.dbt_marts.orders o
left join analytics.dbt_staging.crm_customers c
on o.customer_id = c.customer_id

With join_customer_data set to false, it compiles to:

select
    o.order_id,
    o.customer_id,
    o.amount
from analytics.dbt_marts.orders o

Best practices for conditional logic

  • When working with conditional statements in Jinja, keep your logic readable by using meaningful variable names and adding comments to explain complex conditions
  • Avoid deeply nested conditionals that become difficult to understand and debug
  • Always test your models with different variable combinations to ensure all conditional branches work correctly
  • Use the dbt compile command to verify that your conditional logic generates the expected SQL for different scenarios before running the models against your data warehouse
  • Set sensible default values for your variables to prevent undefined variable errors and make your models more robust
  • Describe the available variables and their expected values in your model’s documentation
  • When you find yourself writing complex conditional logic repeatedly, extract it into reusable macros (a small sketch follows this list)
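
As an example of that last point, the field mapping from the multi-conditional model earlier could be pulled into a macro in the macros/ folder so that every customer model reuses the same logic. A minimal sketch; the macro name is my own:

{% macro customer_columns(data_source) %}
    {% if data_source == 'salesforce' %}
    sfdc_id as customer_id,
    account_name as customer_name,
    created_date as registration_date
    {% elif data_source == 'hubspot' %}
    hubspot_id as customer_id,
    company_name as customer_name,
    create_date as registration_date
    {% else %}
    id as customer_id,
    name as customer_name,
    created_at as registration_date
    {% endif %}
{% endmacro %}

A model can then call {{ customer_columns(var('data_source', 'default')) }} in its select list instead of repeating the if/elif block.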

Conclusion

By mastering conditional statements in Jinja, you will be able to create models that intelligently respond to different environments, configurations, and data sources. Start with simple environment-based conditions, then gradually incorporate more sophisticated logic as your dbt project grows in complexity.

Want to see more Jinja use cases in dbt? Check out the other articles in the “Jinja in dbt” series.

Illustration by Storyset