Decision Tree in Celonis

Hello Team,

I noticed that the decision tree function doesn’t seem to be available. Is that really the case, or is it just that the blue highlighting didn’t show?

Best Regards,
Chen Lei

Hi again Chen Lei :slight_smile:

It’s normal for a function not to be highlighted in blue if it doesn’t appear in the PQL reference. That doesn’t mean it doesn’t work (I tested it myself just to make sure). Which functions are available depends on the version of Celonis you’re using. If you open the help pages, online or directly in Celonis 4, you can see which versions of Celonis each function is available for. The DECISION_TREE function is available in Celonis 4.3, 4.4 and 4.5. However, as your company uses a custom version of Celonis, it’s possible that you don’t have it. If it’s not available for you, you should see an “Invalid operator” error message.

I hope this answers your question.

Best wishes,


Hi Calandra,

Thank you for the explanation.

I tried it with our own data, but it only returns null values.

I also tried the example from the PQL Function Library, but it kept loading and never showed the “Invalid operator” message.

Here’s the link:

Could you please help investigate?

Best Regards,

Chen Lei



Hi Lei,

Unfortunately, I can’t follow your link. Could you please post screenshots?

Has the function ever actually returned null values, or has it just never stopped showing “…”? If it only shows “…”, that could simply be because it takes a very long time to calculate. Machine learning functions like decision trees typically need much more time and computing power than other PQL functions, especially with large datasets. How many rows does your table1 have? How long did you leave it to calculate? Knowing this would make it easier to gauge whether it’s taking an excessively long time.

I would recommend filtering your analysis down to a fraction of its original size while you get the model working correctly. That way you don’t have to wait for the model to finish calculating on the full data while you’re still making changes to it.

Which dimensions do you want to aggregate by? You may have to add them to EXCLUDED (table.exclude_column, … ). It also looks like your model is only calculating based on the dimension TABLE1.X. The model would likely work better if you added extra features to INPUT, for example MAX("EKPO"."NETWR_CONVERTED") or COUNT("EKKO"."EBELN"). I also noticed that you seem to be predicting on the same data you trained the model on. That can exaggerate the model’s performance compared to using it on fresh data, which gives a more realistic picture of the model’s ability to categorize correctly.
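
To illustrate the point about predicting on the training data, here is a small sketch in plain Python rather than PQL. Everything in it is made up for illustration: the toy rows, the labels and the "memorizer" model, which simply stores every training row and stands in for a heavily overfitted tree. It is not how DECISION_TREE works internally; it only shows why scoring a model on its own training rows looks better than scoring it on rows it has never seen.

```python
# Hypothetical toy data: the feature is a row id, the label is "late" for every third row.
rows = [(i, "late" if i % 3 == 0 else "on_time") for i in range(100)]

# Hold out every fifth row; the model never sees these during training.
train = [r for r in rows if r[0] % 5 != 0]
holdout = [r for r in rows if r[0] % 5 == 0]


def fit_memorizer(data):
    """Memorize each training row and keep the majority label as a fallback."""
    lookup = {x: label for x, label in data}
    labels = [label for _, label in data]
    majority = max(set(labels), key=labels.count)
    return lookup, majority


def accuracy(model, data):
    """Share of rows whose predicted label matches the true label."""
    lookup, majority = model
    hits = sum(1 for x, label in data if lookup.get(x, majority) == label)
    return hits / len(data)


model = fit_memorizer(train)
train_accuracy = accuracy(model, train)      # scored on the rows it memorized
holdout_accuracy = accuracy(model, holdout)  # scored on rows it has never seen
print(f"train: {train_accuracy:.2f}, holdout: {holdout_accuracy:.2f}")
```

On this toy data the script prints `train: 1.00, holdout: 0.65`. The first number is the flattering one you get when predicting on the training data; the second is closer to what you could expect on fresh data.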

If your input columns contain a lot of NULL values, that could also cause problems.

If you give us more information about your data and the model you’re trying to build, we can try to help you further.
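
A quick way to check this on an export of your data is to count the share of missing values per input column. The sketch below is plain Python, not PQL, and the column names and rows are hypothetical placeholders; `None` plays the role of NULL.

```python
# Hypothetical rows standing in for the model's input columns; None plays the role of NULL.
rows = [
    {"net_value": 120.0, "item_count": 3},
    {"net_value": None, "item_count": 1},
    {"net_value": None, "item_count": None},
    {"net_value": 80.5, "item_count": 2},
]


def null_share(rows, column):
    """Return the fraction of rows in which the column is missing (None)."""
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


shares = {column: null_share(rows, column) for column in ("net_value", "item_count")}
for column, share in shares.items():
    print(f"{column}: {share:.0%} NULL")
```

Columns with a very high share of NULLs are the first ones I would look at, either by excluding them or by restricting the analysis to rows where they are filled.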

Best wishes,