How to Use Jinja's For Loops in dbt Models
This article is part of the “Jinja in dbt” series. If you’re new to Jinja templating, start with What is Jinja Templating Engine and How to Use It in dbt.
Jinja’s for loops are an important concept to understand if you want to create maintainable and scalable dbt models. They embody the “Don’t Repeat Yourself” (DRY) principle by allowing you to generate SQL code by iterating over lists or arrays. This powerful feature eliminates the manual coding of repetitive patterns and makes your dbt models more resilient to change.
This guide explores the following use cases for Jinja for loops in dbt models:
- The basic syntax of a for loop and how to handle trailing commas
- Iterating over lists to apply repeating transformations
- Iterating over dictionaries for tasks like renaming columns
- Using conditional logic within loops and leveraging advanced loop variables
- Generating dynamic UNION statements
- Best practices for writing effective and readable loops
Basic For loop syntax
Jinja for loops use {% for %} and {% endfor %} tags to iterate over lists, dictionaries, or other iterable objects. The most common use case is iterating over a list of strings to generate SQL.
You dbt model can look like this:
{% set columns = ['customer_id', 'order_date', 'amount', 'status'] %}
select
{% for column in columns %}
{{ column }}{% if not loop.last %},{% endif %}
{% endfor %}
from {{ ref('orders') }}
This compiles to clean, properly formatted SQL:
select
customer_id,
order_date,
amount,
status
from analytics.dbt_staging.orders
In this example, the loop iterates through the columns list set before the SELECT statement. The {{ column }} expression outputs the current item from the columns list. The {% if not loop.last %},{% endif %} is a crucial pattern that adds a comma after each column name except for the last one, ensuring valid SQL syntax.
Iterating over column lists
For loops are perfect for applying the same transformation to multiple columns programmatically, reducing code duplication and manual maintenance.
Here is an example of a dbt model with two sets of fields based on their data type:
{% set numeric_columns = ['revenue', 'cost', 'profit', 'tax_amount'] %}
{% set text_columns = ['customer_name', 'product_name', 'description'] %}
select
order_id,
-- Clean numeric columns by replacing nulls with 0
{% for col in numeric_columns %}
coalesce({{ col }}, 0) as {{ col }}_clean,
{% endfor %}
-- Clean text columns by trimming and converting to uppercase
{% for col in text_columns %}
trim(upper({{ col }})) as {{ col }}_clean{% if not loop.last %},{% endif %}
{% endfor %}
from {{ ref('raw_orders') }}
When compiled, the query will look as follows:
select
order_id,
-- Clean numeric columns by replacing nulls with 0
coalesce(revenue, 0) as revenue_clean,
coalesce(cost, 0) as cost_clean,
coalesce(profit, 0) as profit_clean,
coalesce(tax_amount, 0) as tax_amount_clean,
-- Clean text columns by trimming and converting to uppercase
trim(upper(customer_name)) as customer_name_clean,
trim(upper(product_name)) as product_name_clean,
trim(upper(description)) as description_clean
from analytics.dbt_staging.raw_orders
Creating repeating calculations
You can use Jinja’s for loops to dynamically generate complex aggregations or pivots without writing dozens of repetitive CASE WHEN statements. This is particularly useful for creating summary reports or dashboards.
For example, you need to create several calculations with similar logic that represent revenue value for each region in the regions set:
{% set regions = ['north', 'south', 'east', 'west'] %}
select
date_trunc('month', order_date) as reporting_month,
{% for region in regions %}
sum(case when region = '{{ region }}' then revenue else 0 end) as {{ region }}_revenue,
count(distinct case when region = '{{ region }}' then customer_id else null end) as {{ region }}_customers{% if not loop.last %},{% endif %}
{% endfor %}
from {{ ref('regional_sales') }}
group by 1
This compiles to a comprehensive query:
select
date_trunc('month', order_date) as reporting_month,
sum(case when region = 'north' then revenue else 0 end) as north_revenue,
count(distinct case when region = 'north' then customer_id else null end) as north_customers,
sum(case when region = 'south' then revenue else 0 end) as south_revenue,
count(distinct case when region = 'south' then customer_id else null end) as south_customers,
sum(case when region = 'east' then revenue else 0 end) as east_revenue,
count(distinct case when region = 'east' then customer_id else null end) as east_customers,
sum(case when region = 'west' then revenue else 0 end) as west_revenue,
count(distinct case when region = 'west' then customer_id else null end) as west_customers
from analytics.dbt_staging.regional_sales
group by 1
Iterating over dictionaries
Use dictionaries when you need to work with key-value pairs, for example when renaming columns from a source system or applying different transformations to different columns.
In this dbt model the column_mappings set defines old and new column names:
{% set column_mappings = {
'customer_id': 'cust_id',
'order_date': 'purchase_date',
'total_amount': 'amount',
'order_status': 'status'
} %}
select
{% for new_name, old_name in column_mappings.items() %}
{{ old_name }} as {{ new_name }}{% if not loop.last %},{% endif %}
{% endfor %}
from {{ ref('legacy_orders') }}
This compiles to a clean query with updated column names:
select
cust_id as customer_id,
purchase_date as order_date,
amount as total_amount,
status as order_status
from analytics.dbt_staging.legacy_orders
Conditional logic within For loops
You can combine conditional statements with for loops for more sophisticated query generation, such as masking sensitive data based on environment variables.
Here is a dbt model that specifies the set of columns that should be masked:
{% set all_columns = adapter.get_columns(ref('customer_data')) %}
{% set sensitive_columns = ['ssn', 'credit_card_number', 'phone_number'] %}
select
{% for column in all_columns %}
{% if column.name in sensitive_columns and var('mask_pii', true) %}
'***MASKED***' as {{ column.name }}
{% else %}
{{ column.name }}
{% endif %}
{%- if not loop.last -%},{% endif %}
{% endfor %}
from {{ ref('customer_data') }}
Assuming that your table has columns customer_id, name, ssn, and email; and that mask_pii is true, this compiles to:
select
customer_id,
name,
'***MASKED***' as ssn,
email
from analytics.dbt_staging.customer_data
When mask_pii is false, it compiles to:
select
customer_id,
name,
ssn,
email
from analytics.dbt_staging.customer_data
Advanced loop variables
Jinja provides several useful variables within a for loop that you can use to get the context about the current iteration:
- loop.index: The current iteration of the loop (1-indexed)
- loop.index0: The current iteration of the loop (0-indexed)
- loop.first: True if this is the first iteration
- loop.last: True if this is the last iteration
- loop.length: The total number of items in the sequence
Let’s see how you can use them. This dbt model uses Jinja to dynamically generate calculations for each event type in the event_types list, based on each list’s element position:
{% set event_types = ['page_view', 'add_to_cart', 'purchase', 'sign_up'] %}
select
user_id,
session_id,
{% for event in event_types %}
-- Iteration {{ loop.index }} of {{ loop.length }}
count(case when event_type = '{{ event }}' then 1 end) as {{ event }}_count
{%- if not loop.last %},{% endif %}
{% endfor %}
from {{ ref('raw_events') }}
group by 1, 2
This compiles to:
select
user_id,
session_id,
-- Iteration 1 of 4
count(case when event_type = 'page_view' then 1 end) as page_view_count,
-- Iteration 2 of 4
count(case when event_type = 'add_to_cart' then 1 end) as add_to_cart_count,
-- Iteration 3 of 4
count(case when event_type = 'purchase' then 1 end) as purchase_count,
-- Iteration 4 of 4
count(case when event_type = 'sign_up' then 1 end) as sign_up_count
from analytics.dbt_staging.raw_events
group by 1, 2
Generating dynamic UNION statements
For loops are particularly powerful for generating UNION statements across multiple similar tables.
For example, in this dbt model we need to union tables for each quarter:
{% set quarters = ['2023_q1', '2023_q2', '2023_q3', '2023_q4'] %}
{% for quarter in quarters %}
select
'{{ quarter }}' as quarter,
customer_id,
sum(amount) as total_amount
from {{ ref('sales_' ~ quarter) }}
group by customer_id
{% if not loop.last %}
union all
{% endif %}
{% endfor %}
This compiles to a comprehensive UNION query:
select
'2023_q1' as quarter,
customer_id,
sum(amount) as total_amount
from analytics.dbt_staging.sales_2023_q1
group by customer_id
union all
select
'2023_q2' as quarter,
customer_id,
sum(amount) as total_amount
from analytics.dbt_staging.sales_2023_q2
group by customer_id
union all
select
'2023_q3' as quarter,
customer_id,
sum(amount) as total_amount
from analytics.dbt_staging.sales_2023_q3
group by customer_id
union all
select
'2023_q4' as quarter,
customer_id,
sum(amount) as total_amount
from analytics.dbt_staging.sales_2023_q4
group by customer_id
Best practices
- Start with basic for loops and gradually add complexity as needed
- Use meaningful names: for column in columns_to_clean is much clearer than for x in my_list
- Always run the dbt compile command first to inspect the generated SQL. This is the best way to catch syntax errors like misplaced commas before running your model
- Consider what happens when your lists are empty or contain unexpected values, and add default values to handle such edge cases
- Use Jinja comments {# #} to explain what complex loops are doing. Indent your Jinja code to reflect its structure
Conclusion
For loops are fundamental to leveraging the full power of Jinja in dbt. By mastering them, you can automate repetitive SQL tasks and create scalable data transformation pipelines.
Want to see more Jinja use cases in dbt? Check out other articles in the series:
- What is Jinja templating engine and how to use it in dbt
- How to use Jinja variables in dbt models
- How to use Jinja’s conditional if statements in dbt models
- How to use Jinja’s for loops in dbt models
- How to dynamically select columns in dbt models (coming soon)
- How to debug Jinja statements in dbt models (coming soon)
Illustration by Storyset