Hi, you should make use of the index order function for this variable.
The INDEX_ORDER returns a column with integer indices, starting from 1. The indices indicate the order of the rows. The INDEX_ORDER function creates an INT index column. The index column contains a unique integer value for every row of a required reference input column.
https://help.celonis.de/display/PQL47/INDEX_ORDER
Variable = SUM(CASE WHEN INDEX_ORDER ( "Table"."Column" , ORDER BY ("Table"."Column" DESC) ) = 3
THEN "Table"."Column" ELSE NULL END)
Let's look at the example below for a better understanding
The OLAP table has two columns
- Sales Order Document - "VBAK"."VBELN"
- Net Order Value - "VBAK"."NETWR"
- Index Order - INDEX_ORDER ( "VBAK"."NETWR" , ORDER BY ("VBAK"."NETWR" DESC) )
We have sorted the table based on net order value descending. In the 3rd column, we have built a KPI for showing the index order of the net order value column sorted in descending order.
So for this example the third highest corresponds to sales order document VV197845 with the net order value of 7419448. We would want to store the net order value for this document in a variable.
For this we define the variable 'third_highest' as follows
third_highest = SUM(CASE WHEN INDEX_ORDER ( "VBAK"."NETWR" , ORDER BY ("VBAK"."NETWR" DESC) ) = 3 THEN "VBAK"."NETWR" ELSE NULL END)
Once this is done we can use this in a single KPI component as I have done or in anywhere else as per business need. Hope this helps!
Hi Subhayu,
thanks a lot for your extensive answer!
I have a few follow-up questions, as your formula doesn't work on my end so far:
- How does your variable-code 'know' that you want to have an index based on the Sales Order Document - "VBAK"."VBELN" column? It coult be possible that you want to group the net order value by any other column, like for example Sales Region, right? However, I don't see Sales Order Document - "VBAK"."VBELN" specified in your code.
- How would you proceed to replicate your code for a categorical column that must be counted, instead of summed up? Let's say I would like to calculate the third largest Count of Order Items per Sales Region. Simply changing "SUM" to "COUNT" doesn't work, as the index order is determined by the value written in the "VBAK"."NETWR" column.
Thank you in advance!
Hi Andre, sorry for the delay. Let me answer your questions.
So, when we are using this index order formula we have the option to define what the sorting and partitioning should be. The PQL goes something like this --
INDEX_ORDER ( column , ORDER BY ( sort_column sorting], ... )] , PARTITION BY ( partition_column, ... )] )
However, for the example in the OLAP table I shared, there is no requirement of partitioning if we want a cumulative count of rows, because that is what I thought was your requirement based on the question you asked. However, if we were to partition the index order based on "VBAK"."VBELN", we would have got something like this.
Index Order - INDEX_ORDER ( "VBAK"."NETWR" ,ORDER BY ("VBAK"."NETWR" DESC),
PARTITION BY ( "VBAK"."VBELN" ))
The index order returns 1 for all entries because all the sales document numbers are distinct. It will only return 2 or 3 if the document repeats, since we are partitioning based on that. If you don't go for a partition the highest index order returned will be 419237 which matches with the count of sales order documents since each of them are distinct in this example.
However, suppose you would want to partition your data based on another field which is not distinct. Let us take the example of Sales Organization which is "VBAK"."VKORG". There are 10 possible distinct values for this.
Index Order - INDEX_ORDER ( "VBAK"."NETWR" ,ORDER BY ("VBAK"."NETWR" DESC),
PARTITION BY ("VBAK"."VKORG"))
For your next question regarding a categorical column that must be counted, I think there are better ways to do this other than using a index order function. You could just group the categories into a KPI field using CASE WHEN statements and then use this as a dimension for grouping the net order value and then represent it in a column chart or pie chart. Visually one should be able to tell which is 3rd highest.
Hi Andre, sorry for the delay. Let me answer your questions.
So, when we are using this index order formula we have the option to define what the sorting and partitioning should be. The PQL goes something like this --
INDEX_ORDER ( column , ORDER BY ( sort_column sorting], ... )] , PARTITION BY ( partition_column, ... )] )
However, for the example in the OLAP table I shared, there is no requirement of partitioning if we want a cumulative count of rows, because that is what I thought was your requirement based on the question you asked. However, if we were to partition the index order based on "VBAK"."VBELN", we would have got something like this.
Index Order - INDEX_ORDER ( "VBAK"."NETWR" ,ORDER BY ("VBAK"."NETWR" DESC),
PARTITION BY ( "VBAK"."VBELN" ))
The index order returns 1 for all entries because all the sales document numbers are distinct. It will only return 2 or 3 if the document repeats, since we are partitioning based on that. If you don't go for a partition the highest index order returned will be 419237 which matches with the count of sales order documents since each of them are distinct in this example.
However, suppose you would want to partition your data based on another field which is not distinct. Let us take the example of Sales Organization which is "VBAK"."VKORG". There are 10 possible distinct values for this.
Index Order - INDEX_ORDER ( "VBAK"."NETWR" ,ORDER BY ("VBAK"."NETWR" DESC),
PARTITION BY ("VBAK"."VKORG"))
For your next question regarding a categorical column that must be counted, I think there are better ways to do this other than using a index order function. You could just group the categories into a KPI field using CASE WHEN statements and then use this as a dimension for grouping the net order value and then represent it in a column chart or pie chart. Visually one should be able to tell which is 3rd highest.
Hi Subhayu,
thanks again for your explanation!
Unfortunately, I can't use tables in this use-case:
I would like to save the 3rd largest number of sales order items (grouped by country) as a variable, so I can use it as a coloroing threshold in a heatmap.
Subsequently, I would repeat the logic for the 5th largest or 10th largest number of sales order items per country, to add more thresholds.
I managed to recreate your logic and created an index table like this:
Index - Country - Count(Sales Order Items)
1 - DE - 4.567.243
2 - FR - 3.557.123
3 - ES - 456.809
4 - CN - 120.558
...
However, I still struggle to create a variable that returns 456.809.
Furthermore, my company uses an older version of PQL, in which the INDEX_ORDER function looks like this:
I always translated your functions into this format, but is it possible that 'GROUP' and PARTITION_BY' are not equivalent arguments...
Thank you!