Feature of the week series #20 - Process Regex

Dear Celonis Community!

How do you find cases that follow a complex activity pattern?

This “Feature of the week” post explains how you can use the Match_Process_Regex() PQL function to achieve exactly this: Assign a label to cases that match an activity pattern.

1. Getting to know the Match_Process_Regex() function

How does the Match_Process_Regex() look like? The PQL syntax is

MATCH_PROCESS_REGEX( Activity_Table.Column , regular expression )

with the two inputs

  • Activity_Table.Column: any column from the activity table*
  • regular expression: a pattern of activities**

Pattern of activities (or regular expression) is the rule, that the entries from the specified activity column should follow. If we choose the activity name column from the activity table, this is how an example activity pattern could look:

  • The case starts with the activity ‘A’
  • Activity ‘A’ is followed by ‘B’ or ‘C’
  • The case ends with the activity sequence ‘Y’ directly followed by ‘Z’

So a pattern is just a description what sequence your activities should follow.

The variant A, B, Y, Z matches the above pattern.
The variant A, M, Z, Y does not match this pattern, as ‘A’ shall be followed by ‘B’ or ‘C’ and the case should end with ‘Y’ followed by ‘Z’.

Match_Process_Regex() let’s you find these cases in Celonis. You just have to codify the above pattern:

MATCH_PROCESS_REGEX( Eventtable.Activity_Name , ^ 'A' >> ['B','C'] >> (ANY)* >> 'Y' >> 'Z' $ )

Let’s take a step back and dissect the codified regular expression step by step:

  1. ^ 'A' means that the case starts with activity ‘A’.
    ^ is the start identifier. 'A' is the name of the activity as it occurs in the "Activity_Name" column of the Eventtable.
  2. >> is the PQL sign for “directly followed by”.
    So the activity 'A' is directly followed by the expression inside the brackets []
  3. ['B','C'] means either activity 'B' or 'C' occur directly after 'A'
    [] is an OR collection of activities.
  4. (ANY)* means any activity can occur any number of times.
    ANY is a placeholder for any activity.
    ()* says that the activity inside the brackets can occur 0 or more times
  5. 'Y' >> 'Z' $ means that 'Y' is directly followed by 'Z' at the case end.
    $ is the case end identifier.

What just happened? Above, you just learned that there are many different pattern operators like ^ and >> and that you can use them to your advantage, because they let you codify the complex patterns you have in mind. Having the pattern codified, the MATCH_PROCESS_REGEX() function will return 1 for the cases that match, and 0 for the cases that don’t match this pattern. This allows you now for example to filter the Variant Explorer to all cases that match your pattern.

Before looking at it in action, you can see the entire pattern syntax and examples in the help (IBC, on-premise, sap on-premise).

2. Seeing it in action

Click here and navigate this analysis to get a feeling how the MATCH_PROCESS_REGEX() function can be used.

What complex patterns do you have in mind? Happy Matching!
Your Celonis Product Team

*) You can use any string column from your Eventlog. For example, you can also match patterns for user types, like ( Eventlog.User_Type, ‘Batch’ >> ‘Batch’ >> ‘Manual’, …) !
**) Unlike regex functions that you might already know, this regex does not refer to letters inside a word. It refers to activities inside a case. So the pattern you can search for is not ‘W’ >> ‘o’ >> ‘r’ >> ‘d’ to find the activity named ‘Word’ but rather ‘Order pizza’ >> ‘Receive pizza’ >> ‘Eat pizza’ to find the process case of that order.