# Average throughput time for repeating activities per case

#1

Hello everyone,

i want to calculate the average throughput time between (1) Revision and “Passed” and (2) “Revision” and “Declined”. But I have 2 problems: 1) It can appear that multiple revisions per case took place, 2) there can be some other activities between the Revision and the “Passed” or “Declined” . So for example
Activity1->Activity2->Revision->Passed->Activity3->Revision->Declined
Activity1->Activity2->Activity3->Revision->Activity4->Declined->Activity4->Revision->Declined
So in my avg. throughput time calculation I want to see the 3 durations for situation (2) and one for the situation (1).

Any ideas?
Thanks!

0 Likes

#2

Hello,

As I’m sure you have already noticed the difficulty in your problem lies in the fact that it is only possible to select ‘first’ or ‘last’ occurrence of each activity when calculating the throughput time between activities. I have now filed a feature request with our development team on your behalf for a ‘next occurrence’ option to be introduced. However, in the meantime here is a possible work around.

It is however worth noting that this solution only works if at most ‘Revision’ -> ‘Declined’ happens twice per case (like it does in your example). I would recommend calculating both the average time difference between the first occurrences of A and B and the average time difference between the last occurrences of A and B. If you add these two averages together and divide by 2 you get the overall average. In those cases where there is only one occurrence of “Revision -> Declined” then the first and last occurrence of each activity is the same however this won’t affect the overall average. Here is the code I had in mind:

(AVG(CALC_THROUGHPUT(FIRST_OCCURRENCE[‘Revision’] TO FIRST_OCCURRENCE[‘Declined’], REMAP_TIMESTAMPS("ACTIVITY TABLE"."EVENTTIME", DAYS))) +

AVG(CALC_THROUGHPUT(LAST_OCCURRENCE[‘Revision’] TO LAST_OCCURRENCE[‘Declined’], REMAP_TIMESTAMPS("ACTIVITY TABLE"."EVENTTIME", DAYS)))
)/2

If there are cases where ‘Declined’ is logged before ‘Revision’ then these cases could be filleted out using Process (not) equals conditioned. Which while not ideal shouldn’t have a huge effect on the overall average.

I hope this helps,

Best wishes,

Calandra (Celonis Data Science Team)

0 Likes

#3

Hi, Maria!

There is no easy formula AFAIK. You can try following ugly construct:
DAYS_BETWEEN(
PU_FIRST(“cases”, “events”.“activity”, ACTIVATION_COUNT(REMAP_VALUES(“Table1”.“ACTIVITY”, [‘Revision’, ‘Revision’], NULL)) = 1),
PU_FIRST((“cases”, “events”.“activity”, ACTIVATION_COUNT(REMAP_VALUES(“Table1”.“ACTIVITY”, [‘Declined’, ‘Declined’], [‘Passed’, ‘Passed’], NULL)) = 1)
) +
DAYS_BETWEEN(
PU_FIRST(“cases”, “events”.“activity”, ACTIVATION_COUNT(REMAP_VALUES(“Table1”.“ACTIVITY”, [‘Revision’, ‘Revision’], NULL)) = 3),
PU_FIRST((“cases”, “events”.“activity”, ACTIVATION_COUNT(REMAP_VALUES(“Table1”.“ACTIVITY”, [‘Declined’, ‘Declined’], [‘Passed’, ‘Passed’], NULL)) = 3)
) +
DAYS_BETWEEN(
PU_FIRST(“cases”, “events”.“activity”, ACTIVATION_COUNT(REMAP_VALUES(“Table1”.“ACTIVITY”, [‘Revision’, ‘Revision’], NULL)) = 3),
PU_FIRST((“cases”, “events”.“activity”, ACTIVATION_COUNT(REMAP_VALUES(“Table1”.“ACTIVITY”, [‘Declined’, ‘Declined’], [‘Passed’, ‘Passed’], NULL)) = 3)
)

Unfortunately have to time to test on real data. It may be necessary to check if PU_FIRST return NULL before calculation DAYS_BETWEEN. Idea behind it, is that you start with first occurrence of activity pairs, then second, then third. Add as many as you assume is possible to encounter in the process. It is possible

0 Likes

#4

Hi Maria

It turns out there is a solution to your problem. You could use the little known PQL functions SOURCE and TARGET. The credit for this solution goes to my colleague David Becher.

The SOURCE / TARGET operators where developed for use cases such as this one where relationships between activities need to aggregated across all instances and all cases.

here is how you could calculate the average difference in time for Revision -> Passed and Revision -> Declined respectively:

1. Create an OLAP table with dimensions containing the following code:

1.1 SOURCE(“ACTIVITIES”.“ACTIVITY”, CASE WHEN “ACTIVITIES”.“ACTIVITY” = ‘Revision’ OR “ACTIVITIES”.“ACTIVITY” = ‘Passed’ OR “ACTIVITIES”.“ACTIVITY” = ‘Declined’ THEN ACTIVITIES".“ACTIVITY” ELSE NULL END)

1.2 TARGET("ACTIVITIES"."ACTIVITY")

2. Filter the table so only the connections Revision-&gt;Passed and Revision-&gt;Declined are shown:

FILTER SOURCE(“ACTIVITIES”.“ACTIVITY”, CASE WHEN “ACTIVITIES”.“ACTIVITY” = ‘Revision’ OR “ACTIVITIES”.“ACTIVITY” = ‘Passed’ OR “ACTIVITIES”.“ACTIVITY” = ‘Declined’ THEN ACTIVITIES".“ACTIVITY” ELSE NULL END) = ‘Revision’;

FILTER TARGET(“ACTIVITIES”.“ACTIVITY”) = ‘Passed’ OR TARGET(“ACTIVITIES”.“ACTIVITY”) = ‘Declined’;

With this approach only the time difference between a ‘Revision’ activity and the next ‘Passed’/’Declined’ activity is included in the aggregation, even if there are other activities in between. This is because the default ‘edge configuration’ is used where the activities are automatically linked to the activity coming immediately afterwards for purposes of the aggregation. The reason why it is possible for other activities to take place between Revision and Passed/Declined is that the case when statement in the SOURCE dimension gives all activities other than ‘Revision’, ‘Passed’ and ‘Declined’ the value NULL so the link skips them connecting to the next relevant (non null) activity. Its also worth noting that the activities in the Source column always come before those in Target column. I.e. the time between the first ‘Declined’ activity and the next ‘Revision’ activity is not included in the calculation of the average time difference.

I hope this helps,

Best wishes,

Calandra (Celonis Data Science Team)

4 Likes

#5

Thank you both for the reply!
@Calandra - can u suggest how to modify the code so that I don’t track Revison-to-Revision durations?

My OLAP-Table looks like
SOURCE TARGET AVG Duration
Revision Revision X Days
Revision Passed Y Days
Revision Declined Z Days

Thank you

0 Likes

#6

So basically u need to divide by the number of activations at the end?

0 Likes

#7

Yes. I’ve checked formula and it works.
Use SUM( DAYS_BETWEEN() + DAYS_BETWEEN() + … )/COUNT( number of cases). Each of the DAYS_BETWEEN() statement will give you time between first, second and so on pairs of activities. So you can sum them up.

0 Likes