If you’ve been writing SQL for a while, you’ve probably used aggregate functions like SUM, AVG, or COUNT to get summary statistics. But then someone mentions window functions and you wonder why use those if I can already aggregate my data? If this question ever crossed your mind this article is for you. We will go through the similarities, the differences and some examples.

Note: I will be using the Snowflake TPCH_SF1 sample data for the examples below so you can recreate the results.

Although they look similar, aggregate functions and window functions behave very differently, and knowing when to use each will save you from writing overly complex queries.

Aggregate Functions: The Blender Link to this heading

Aggregate functions, such as SUM, AVG, COUNT, MAX, and MIN, group rows together and return a single value per group. They help answer questions like what is the average order price per region? Or how many orders each customer makes per Q?

If the answer to your question is a single aggregated value in general or per group then an aggregate function is the way to go.

You can think of it as a blender, you give it multiple data points and it summarises it down to one result that represents all the input.

Example:

sql
1SELECT 
2    O_CUSTKEY, 
3    AVG(O_TOTALPRICE) AS AVERAGE_ORDER 
4FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS 
5GROUP BY O_CUSTKEY; 

Results Snippet:

O_CUSTKEY AVERAGE_ORDER
1910 129346.8600000000
74321 166403.0200000000
22052 183089.4550000000
86840 243148.4400000000
40787 164023.1266666667
141908 256498.7666666667
18455 137463.4500000000
81878 118406.8150000000
48365 196978.3700000000
86269 137881.3233333333

As you can see, this makes aggregate functions the best solution when you need a quick overview of your data without going into details. We can see the average order value for each customer, but we have no idea about the details that went into that number.

This brings us to where aggregate functions lack behind. Now if we go back to our blender example, if I were to give you a smoothie and ask you to compare its taste to each ingredient that went into making it that would be near impossible if you don’t know the ingredients to start with right? That’s where window functions come to use.

Window Functions: The Chopper Link to this heading

Same as aggregate functions, window functions perform calculations across rows, but they leave the rows intact. Let’s say you want to compare customers’ order values to the average order value. You will need to keep all the rows containing the order values and add a column containing the overall average.

This is exactly what a window function does. Instead of GROUP BY, it used an OVER clause to know which groups to aggregate on while keeping the original rows.

Now going back to the kitchen, you can think of it as a chopper you use to make a salad. You do get the main result of combining all your ingredients but still you can distinguish them apart.

Example:

sql
1SELECT 
2    O_CUSTKEY, 
3    O_TOTALPRICE, 
4    AVG(O_TOTALPRICE) OVER(PARTITION BY O_CUSTKEY) AS AVERAGE_ORDER 
5FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS 
6ORDER BY O_CUSTKEY; 

Results Snippet:

O_CUSTKEY O_TOTALPRICE AVERAGE_ORDER
1 123076.84 97960.48500
1 74602.81 97960.48500
1 174645.94 97960.48500
1 95911.01 97960.48500
1 54048.26 97960.48500
1 65478.05 97960.48500
2 167016.61 146896.20428
2 174291.41 146896.20428
2 221397.35 146896.20428
2 33082.83 146896.20428

Now, we can see each individual order price along the average for that customer. This allows us to do deeper analyses and comparisons.

One Final Note: Filtering Link to this heading

Now it is important to note that filtering for each type of those functions is different.

For aggregate functions we use a HAVING clause while repeating the function expression:

sql
1SELECT 
2    O_CUSTKEY, 
3    AVG(O_TOTALPRICE) AS AVERAGE_ORDER 
4FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS 
5GROUP BY O_CUSTKEY 
6HAVING AVG(O_TOTALPRICE) > 10000; 

And for window functions we can use a CTE or, when using snowflake, we can use a QUALIFY clause:

CTE

sql
 1WITH ORDER_AVG AS ( 
 2    SELECT 
 3        O_CUSTKEY, 
 4        O_TOTALPRICE, 
 5        AVG(O_TOTALPRICE) OVER (PARTITION BY O_CUSTKEY) AS AVERAGE_ORDER 
 6    FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS 
 7) 
 8SELECT * 
 9FROM order_avg 
10WHERE AVERAGE_ORDER > 150000 
11ORDER BY O_CUSTKEY; 

QUALIFY

sql
1SELECT 
2    O_CUSTKEY, 
3    O_TOTALPRICE, 
4    AVG(O_TOTALPRICE) OVER (PARTITION BY O_CUSTKEY) AS AVERAGE_ORDER 
5FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS 
6QUALIFY AVERAGE_ORDER > 150000 
7ORDER BY O_CUSTKEY; 

Wrapping Up: Knowing Which Tool to Use Link to this heading

Aggregate and Window functions are one of the most used tools in querying data, and understanding when to use each helps us write more efficient and readable queries.

So next time you’re writing SQL and trying to decide between GROUP BY and OVER, just ask yourself: Do I want a smoothie or a salad?