For data analyst freelance services contact me [email protected]
Introduction
This post is the second post of a series in which I will describe common analytical SQL queries that a data analyst is doing on a daily basis. In this chapter I will focus on analytical implementation that can be executed in SQL using the Rank function.
The last chapter was focused on analytical implementation you can do on SQL with group by clause.
Ranking Customers in SQL with Rank Function
The most basic operation that can be performed with the Rank function is offcourse to the rows in a table.
In the next query we will show how we will rank the customers (customers table) according to the purchase amount.
Query:
Results
Explain
The customer ranking by purchase amount (the sum_orders field) can be seen in the rank_orders variable. I also sorted the table by this variable.
Because the rank function is a window function, it allows us to determine the order and grouping variable before running. By doing so, I can rank customers within their countries.
I did such a calculation in the rank_orders_by_country field. The rank of each customer in this field is determined by its rank within the country.
For example, the customer in the second row received rank 2 in the overall ranking, but within his country (UK) he is ranked 1 – the highest.
Finding Second Place in the Ranking
Sometimes an analyst is required to find the second or third item in the rank. It could question like – the second most listened song or the third best customer.
In this example I will demonstrate how to find the the second customer with the most amount that lives in Australia:
select
*
from
(
select
rank() over (order by sum_orders desc) as rank_orders,
customer_id,
country,
sum_orders
from
customers
where
country = ‘Australia’
and customer_id is not null
) as tab
where
rank_orders=2
Results
Explanation
I used a query from the previous chapter and filtered the customers from Australia in the where order.
I insert the query inside a subquery and then filtered the second ranked customer.
Please note, there are database engines that allow you to use ‘qualify’ clause.
With the help of the qualify clause it is possible to filter actions performed in a window function. Such filtering eliminates the need to insert of the rank query in subquery:
select
rank() over (order by sum_orders desc) as rank_orders,
customer_id,
country,
sum_orders
from
customers
where
country = ‘Australia’
qualify
rank_orders=2
Finding the last purchase of all customers with rank functions
The Rank command can also help us when we want to create a table that contains the rows with the last execution of a particular action. The following example shows how to create a table with each customer’s last purchase.
For this task I will work with the Orders table.
Query
select
*
from
(
select
rank() over (partition by customer_id order by order_time_stamp desc) as order_rank,
*
from
orders
) as tab
where
order_rank=1
Outcome
Explanation
With the Rank command I asked the database to rank all purchases of each customer when they are ordered in descending order (sort by desc). That means that the last order will appear in the first place in the ranking. After the rank, I filtered the data and chose only the orders that were ranked first (1). These are the last purchese of the customer.
For data analyst freelance services contact me [email protected]