Simple explanation on window functions in SQL

You are currently viewing Simple explanation on window functions in SQL
Picture by pixabay
  • Post category:Data analytics
  • Reading time:5 mins read
The article was written by Yuval Marnin.
For data analyst freelance services contact me [email protected]

Explanation of Window Functions

Window functions (also called analytical functions) are an extension of the standard SQL language and can perform calculations on a large number of rows (the calculation itself is performed by an internal and hidden query of the database engine). These functions return a single value that corresponds to each row in the query we wrote.

The functions are called ‘window’ because the window defines the range of rows in the calculation. The window of the function is defined by the Over command.

On the rows within the window, we can define groups by the “Partition BY” command and even specify their order the with “Order By” command. We can also narrow the window range by the commands: rows, preceding, following.

Example:

select

       orders.*,

       sum(sum_order) over () as sum_of_all_orders,

       sum(sum_order) over (partition by customer_id) as sum_of_customer_orders,

       max(sum_order) over (partition by country) as max_order_at_country,

       avg(sum_order) over (partition by country) as avg_order_at_country,

       case when sum_order > avg(sum_order) over (partition by country) then 1 else 0 end as order_is_above_country_avg

from

       orders

Explanation

sum(sum_order) over () as sum_of_all_orders

In this function, there are no values inside the Over command and therefore the size of the window is not limited. Since there is no limit to the size of the window, the calculation of the sum function will refer to all the rows in the table, In other words, the purchase total will appear in all rows.

 sum(sum_order) over (partition by customer_id) as sum_of_customer_orders

This function is similar to the previous one, but in this case, we divided the window into groups of customers according to the Customer_id (in fact, we performed an aggregation operation by Customer_id). Therefore, in each row, we will get the sum of purchases of the customer.

Types of window functions

Aggregate functions (we saw examples in the previous section):

Count, Avg, Sum, Min, Max, etc. – These functions are used for adding aggregate calculations on numbers.

Values

Lag\Lead – Return the value of the previous or next row in the table’s output.

First_value\Last_value – Return the first or last value of the window.

Rank functions

Rank/Row_Number – Used to rank numbers in a specific order.

Percentile – Used to determine percentile values.

Qualify

There are databases that allow using window function filtering. This instruction is called Qualify. In this example, filtering will be performed on all orders of customers whose total purchases are above $1,000.

select

       orders.*,

       sum(sum_order) over (partition by client_id) as sum_of_all_orders

from

       orders

qualify

       sum(sum_order) over (partition by client_id)>1000


This article was written by Yuval Marnin.
If you have a need to hire a freelancer data analyst you may contact me at: [email protected]

Yuval Marnin

For data analytics mentoring services: [email protected]