Celonis uses a kind of Fuzzy Miner to show the results. For every activity and every edge between two activities, this miner provides the occurrence ratio relative to the whole data set.
I highlighted whole data set, since this is the difference from looking only at the most frequent variant. For instance, if the most frequent variant is A-B-C-D-E (50%), the second most frequent is A-B-C-D (30%) and the third is A-B-C (20%), it is still possible that the Process Explorer shows A-B-C, since those three steps occur most often over the complete data set (100% of cases), while E occurs in 50% and D in 50+30=80%.
In your situation, it is likely that you have multiple variants that occur quite often. It is technically possible that the most frequent variant is A-B-C-D (40%), but the next variants are A-B-C-E-F (30%) and A-B-C-E-G (30%). In this example, activity E (30+30=60%) occurs more often than activity D (40%), and is therefore displayed in the Process Explorer as the next step, even though it is not part of the most frequent variant.
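The arithmetic in both examples can be reproduced with a few lines of Python. This is only a sketch of the counting idea, not how Celonis implements it; the function name `activity_coverage` and the variant lists are made up for illustration.

```python
from collections import defaultdict

def activity_coverage(variants):
    """Sum, per activity, the share (in %) of cases whose variant contains it."""
    coverage = defaultdict(int)
    for path, share in variants:
        # set() so an activity counts once per variant, even if it repeats
        for activity in set(path.split("-")):
            coverage[activity] += share
    return dict(coverage)

# First example: the most frequent variant is A-B-C-D-E
example_1 = [("A-B-C-D-E", 50), ("A-B-C-D", 30), ("A-B-C", 20)]
# Second example: the most frequent variant is A-B-C-D
example_2 = [("A-B-C-D", 40), ("A-B-C-E-F", 30), ("A-B-C-E-G", 30)]

cov1 = activity_coverage(example_1)
cov2 = activity_coverage(example_2)

print(sorted(cov1.items()))
print(sorted(cov2.items()))
```

In the second example, E ends up with 60 and D with 40, which is why E can be shown ahead of D even though the single most frequent variant goes through D.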
I hope this explains it.
Hi Jan-peter,
Many thanks for your reply! Following your explanation above and some other information we have collected, we ran a few small tests to verify the underlying logic, but the result is even more confusing. For example, we created a test event log table as shown below. It has 9 cases, with the following variants:
- a1 a2 c
- a1 a2 a3 c
- a1 a2 a3 c
- a1 a2 a3 d
- a1 a2 a3 e
- c b
- c b
- d b
- e b
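For reference, the raw counts for this test log can be tallied with a short script. This is a plain Python sketch, not Celonis or PQL code, and the variable names are only illustrative; it counts cases per activity, directly-follows edges, and start and end activities, without assuming anything about how the Process Explorer ranks them.

```python
from collections import Counter

# The 9 test cases listed above, one list of activities per case
cases = [
    ["a1", "a2", "c"],
    ["a1", "a2", "a3", "c"],
    ["a1", "a2", "a3", "c"],
    ["a1", "a2", "a3", "d"],
    ["a1", "a2", "a3", "e"],
    ["c", "b"],
    ["c", "b"],
    ["d", "b"],
    ["e", "b"],
]

# Number of cases in which each activity occurs at least once
activity_cases = Counter(a for trace in cases for a in set(trace))
# Number of times each directly-follows edge (x -> y) occurs
edges = Counter((x, y) for trace in cases for x, y in zip(trace, trace[1:]))
# First and last activity of each case
starts = Counter(trace[0] for trace in cases)
ends = Counter(trace[-1] for trace in cases)

print(activity_cases.most_common())
print(edges.most_common())
print(starts.most_common())
print(ends.most_common())
```

By this count, a1, a2 and c each occur in 5 of the 9 cases, a1 -> a2 is the most frequent edge (5 times), and b is the most frequent ending activity (4 cases).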
Based on the official Celonis documentation, we would expect the baseline graph to start with A1 and C (B is the most frequent ending activity, but there is no variant connecting A1 with . Following your reply above, we should then probably expect something like A1-A2 or A1-A2-A3 as the baseline graph. However, the result in Celonis shows something completely different, C-B, as shown in the 2nd screenshot below.
Any ideas?