The longest but ultimate guide for being an Automation Rate expert (SAP focus)

The aim of Celonis is to help create superfluid processes and superfluid enterprises. One crucial part of getting closer to this goal is automation. Automating whole processes or just single activities creates huge business value by reducing manual labor. However, more often than not, analysts struggle when implementing an automation rate (AR) analysis in Celonis.

Unfortunately, for SAP source systems the standard procedure of using the user type for the AR calculation does not work in many situations. Third-party systems, batch jobs and many other hurdles can easily mess up the once cool AR. This post tackles the topic by giving a general overview and guidance on how to calculate the AR and overcome enterprise-specific obstacles.

Topics covered:

I) In a nutshell: The Automation Rate basics

a) Data preparation is key

b) Automation Rate calculation basics

c) Added flexibility with activity selection & removed user types

II) Different customizations regarding calculation of Automation Rate

a) Username

b) Transaction code of activity

c) Time of day of activity

d) Speed of activity

e) Other indicators

III) Further recommendations & best practices

a) Level matters (e.g. PO header vs. PO items)

b) Start small and scale later

c) You can already customize data on database level

d) Third party systems can be problematic

e) Be creative but accurate

f) Make everyone understand the numbers

I) In a nutshell: The Automation Rate basics

a) Data preparation is key

A proper automation rate calculation starts early in your project. Gathering the right tables and indicators is paramount: if the information that identifies automation is missing from the data, the AR can hardly be calculated later. See the P2P examples for SAP ECC below:

b) Automation Rate calculation basics

Let's assume we have the relevant information in place and can start digging into the AR. In most cases the AR is calculated by looking at the user type that performed an activity (e.g. in SAP: USR02.USTYP), using the formula: automated activities (executed by specific user types) divided by all activities.

-- # Automated Activities
SUM( CASE
WHEN "_CEL_P2P_ACTIVITIES"."USER_TYPE" IN (<%=automation_usertypes%>)
THEN 1 ELSE 0
END ) * 1.0 / (
-- # Automated Activities
SUM( CASE
WHEN "_CEL_P2P_ACTIVITIES"."USER_TYPE" IN (<%=automation_usertypes%>)
THEN 1 ELSE 0
END ) * 1.0
+
-- # Manual Activities
SUM( CASE
WHEN "_CEL_P2P_ACTIVITIES"."USER_TYPE" NOT IN (<%=automation_usertypes%>)
THEN 1 ELSE 0
END ) * 1.0
)
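To sanity-check the ratio outside Celonis, the same logic can be sketched in a few lines of Python. This is only an illustration: the function name, the sample list and the choice of "B" as the automated user type are assumptions standing in for the <%=automation_usertypes%> variable.

```python
def automation_rate(user_types, automated_types):
    """Share of activities executed by one of the automated user types."""
    automated = sum(1 for t in user_types if t in automated_types)
    manual = sum(1 for t in user_types if t not in automated_types)
    total = automated + manual
    # Avoid a division by zero when no activity is selected.
    return automated / total if total else None


# Hypothetical sample: one user type per activity row.
sample_user_types = ["A", "B", "A", "B", "B", "A", "A", "B"]
basic_rate = automation_rate(sample_user_types, automated_types={"B"})
print(basic_rate)  # 0.5
```

Four of the eight sample activities carry the assumed automated user type, so the rate is 0.5.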

c) Added flexibility with activity selection & removed user types

Can you look at how automated your whole process is? Yes. Should you? Probably not. And here are two reasons why:

On the one hand, if you look at your whole process and include every activity, you include a lot of tasks that cannot be automated. This is not helpful because it skews your data and does not show the actual automation potential.

On the other hand, validating the AR takes effort (per activity!), and automating even single steps already adds huge value. Therefore, only include the activities that you can and want to automate in your process.

Often, activities are also connected to a user with the user type NULL. This can happen after a user has been deleted from the system (e.g. because they left the company). Since it is not clear whether these activities were automated or not, they can also be excluded from the calculation. See the more sophisticated version of the AR below:

-- # Automated Activities
SUM( CASE
WHEN ISNULL("_CEL_P2P_ACTIVITIES"."USER_TYPE") = 0
AND "_CEL_P2P_ACTIVITIES"."ACTIVITY_EN" IN (<%=automation_act%>)
AND "_CEL_P2P_ACTIVITIES"."USER_TYPE" IN (<%=automation_usertypes%>)
THEN 1 ELSE 0
END ) * 1.0 / (
-- # Automated Activities
SUM( CASE
WHEN ISNULL("_CEL_P2P_ACTIVITIES"."USER_TYPE") = 0
AND "_CEL_P2P_ACTIVITIES"."ACTIVITY_EN" IN (<%=automation_act%>)
AND "_CEL_P2P_ACTIVITIES"."USER_TYPE" IN (<%=automation_usertypes%>)
THEN 1 ELSE 0
END ) * 1.0
+
-- # Manual Activities
SUM( CASE
WHEN ISNULL("_CEL_P2P_ACTIVITIES"."USER_TYPE") = 0
AND "_CEL_P2P_ACTIVITIES"."ACTIVITY_EN" IN (<%=automation_act%>)
AND "_CEL_P2P_ACTIVITIES"."USER_TYPE" NOT IN (<%=automation_usertypes%>)
THEN 1 ELSE 0
END ) * 1.0
)
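The effect of the two extra filters (activity selection and NULL user types) is easy to check on a handful of rows. The Python sketch below mirrors the logic; the sample events, the activity set and "B" as automated user type are illustrative assumptions, not values from a real system.

```python
def filtered_automation_rate(events, automated_types, automation_activities):
    """AR over selected activities only, ignoring rows without a user type.

    events: iterable of (activity, user_type) pairs; user_type may be None.
    """
    automated = manual = 0
    for activity, user_type in events:
        # Skip deleted users (NULL user type) and activities out of scope.
        if user_type is None or activity not in automation_activities:
            continue
        if user_type in automated_types:
            automated += 1
        else:
            manual += 1
    total = automated + manual
    return automated / total if total else None


# Hypothetical sample rows.
sample_events = [
    ("Create PO item", "B"),
    ("Create PO item", "A"),
    ("Create PO item", None),   # user deleted, excluded
    ("Record GR", "B"),
    ("Change Price", "A"),      # activity not in scope, excluded
    ("Record GR", "B"),
]
filtered_rate = filtered_automation_rate(
    sample_events,
    automated_types={"B"},
    automation_activities={"Create PO item", "Record GR"},
)
print(filtered_rate)  # 0.75
```

Only four rows remain after filtering, three of them automated, which gives 0.75 instead of the 0.5 the unfiltered ratio would report for the same rows.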

II) Different customizations regarding calculation of Automation Rate

The above-mentioned approach is applicable to many activities and is a solid starting point. However, there are several scenarios where it can lead to wrong or misleading results. To name just one example, a dialog user can initiate a batch job. In this instance, the user is counted as the executor of the step even though the major part was automated. Therefore, some case-by-case adaptations can help to get the numbers right.

a) Username

Almost every activity in the (SAP) system is connected to the user and its username that performed it. This can be a helpful workaround to calculate a more accurate AR. It can be the case that the actual usernames in the system are pseudonymized. Even then, looking at the usernames can be helpful.

Not pseudonymized:

As all the manual users are known in the enterprise, every activity performed by such a user can be excluded. The remaining activities are counted as being automated.

Here, including an adapted form of the snippet below may help:

CASE WHEN "_CEL_P2P_ACTIVITIES"."ACTIVITY_EN" = 'Create PO item' AND "_CEL_P2P_ACTIVITIES"."USER_NAME" NOT LIKE '%atch' THEN 1.0 ELSE 0.0 END

Pseudonymized:

When usernames are pseudonymized, it gets a little trickier. Here, the characteristics of manual users compared to batch users can be helpful. The volume of performed activities is particularly interesting (e.g. ask yourself: has user 14#9AA9915 really done 15 million PO item creations, or is this an automated user?).

Next, focus on these usernames and use enterprise-specific process knowledge (sample checks in SAP) to figure out whether you can label them with certainty as batch users.
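A first shortlist of suspicious usernames can be produced by simply counting activities per pseudonymized user. The Python sketch below is one possible way to do this; the function name, the 50% share threshold and the sample pseudonyms other than the one quoted above are assumptions, and the result is only a list of candidates for the manual check, not a classification.

```python
from collections import Counter


def high_volume_users(user_names, min_share=0.5):
    """Return users who account for at least min_share of all activities."""
    counts = Counter(user_names)
    total = sum(counts.values())
    if total == 0:
        return []
    return sorted(u for u, n in counts.items() if n / total >= min_share)


# Hypothetical sample: one pseudonymized username per activity row.
sample_users = ["14#9AA9915"] * 8 + ["77#B10C220"] + ["31#C2D4471"]
candidates = high_volume_users(sample_users)
print(candidates)  # ['14#9AA9915']
```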

As mentioned before, it can become problematic when a dialog user starts a batch job. The dialog user would be connected to significantly higher activity volumes and the automation rate would be misleading. This is a problem for both the user type and the username approach.

Currently, there is no automatic and foolproof way to identify batch jobs. But there is a useful workaround (a heuristic that is applied in newer versions of the Celonis transformation scripts):

The aim is to label mass transactions with user type ‘R’ in their user_type column. The following is done to achieve this:

When the …

  • same username performs the
  • same activity within
  • x seconds (e.g. 20) for different
  • document numbers

then it is assumed to be a mass transaction and can be labelled with ‘R’.
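The heuristic above can be sketched in Python as follows. This is not the Celonis transformation script itself, only an illustration under stated assumptions: the field names (user_name, activity, document, eventtime, user_type) are hypothetical, and the sketch compares only consecutive events of the same user and activity.

```python
from datetime import datetime, timedelta


def label_mass_transactions(events, window_seconds=20):
    """Return a copy of events with user_type 'R' for assumed mass transactions."""
    labelled = [dict(e) for e in events]
    groups = {}
    for idx, event in enumerate(labelled):
        groups.setdefault((event["user_name"], event["activity"]), []).append(idx)

    window = timedelta(seconds=window_seconds)
    for indices in groups.values():
        indices.sort(key=lambda i: labelled[i]["eventtime"])
        for prev, curr in zip(indices, indices[1:]):
            a, b = labelled[prev], labelled[curr]
            close_in_time = b["eventtime"] - a["eventtime"] <= window
            if close_in_time and a["document"] != b["document"]:
                a["user_type"] = "R"
                b["user_type"] = "R"
    return labelled


t0 = datetime(2024, 1, 8, 9, 0, 0)
# Hypothetical sample rows.
sample_events = [
    {"user_name": "U1", "activity": "Create PO item", "document": "D1",
     "eventtime": t0, "user_type": "A"},
    {"user_name": "U1", "activity": "Create PO item", "document": "D2",
     "eventtime": t0 + timedelta(seconds=5), "user_type": "A"},
    {"user_name": "U1", "activity": "Create PO item", "document": "D3",
     "eventtime": t0 + timedelta(seconds=12), "user_type": "A"},
    {"user_name": "U2", "activity": "Create PO item", "document": "D4",
     "eventtime": t0, "user_type": "A"},
    {"user_name": "U2", "activity": "Create PO item", "document": "D5",
     "eventtime": t0 + timedelta(minutes=10), "user_type": "A"},
]
labelled_types = [e["user_type"] for e in label_mass_transactions(sample_events)]
print(labelled_types)  # ['R', 'R', 'R', 'A', 'A']
```

User U1 touches three different documents within twelve seconds and is relabelled, while U2 with ten minutes between documents keeps the original user type.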

b) Transaction code of activity

Every type of activity that is performed is connected to a specific transaction code. The transaction code is the technical abbreviation of the activity in the system. Additionally, certain characteristics can be represented by the transaction code, e.g. whether it was performed by a batch user.

When transaction codes for batch users are known, they can be used to identify automated activities (e.g. by integrating an additional case statement):

CASE WHEN "_CEL_P2P_ACTIVITIES"."ACTIVITY_EN" = 'Change Price' AND "_CEL_P2P_ACTIVITIES"."TRANSACTION_CODE" = 'VA21' THEN 1.0 ELSE 0.0 END

When the transaction codes for batch users are unknown, it helps to compare the different transaction codes of the same activity with each other. The transaction codes of manual users should be known and can be used to exclude activities. The remaining transaction codes can then be reviewed. Based on this review, you can assess whether a code indicates an automated activity.
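The review list described above can be prepared with a simple count per activity. The following Python sketch shows the idea; the function name is hypothetical, and which transaction codes count as "known manual" in the sample is purely an assumption for illustration that has to come from your own system knowledge.

```python
def codes_to_review(events, known_manual_codes):
    """Count, per activity, the transaction codes not known to be manual.

    events: iterable of (activity, transaction_code) pairs.
    """
    remaining = {}
    for activity, tcode in events:
        if tcode in known_manual_codes:
            continue
        per_activity = remaining.setdefault(activity, {})
        per_activity[tcode] = per_activity.get(tcode, 0) + 1
    return remaining


# Hypothetical sample rows; the manual code set is an assumption.
sample_events = [
    ("Create PO item", "ME21N"),
    ("Create PO item", "ME21N"),
    ("Create PO item", "ME59N"),
    ("Create PO item", "ME59N"),
    ("Create PO item", "ME59N"),
    ("Change Price", "ME22N"),
]
review = codes_to_review(sample_events, known_manual_codes={"ME21N", "ME22N"})
print(review)  # {'Create PO item': {'ME59N': 3}}
```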

c) Time of day of activity

An essential part of the digital footprints used for process mining is the time stamp of an activity. This also comes in handy when trying to calculate the AR.

Based on enterprise specific working hours, you can identify activities that are automated. Activities that were performed past end of business, e.g. during night hours, can most likely be labelled as automated. Before including this case in your calculation, you should check the working hours of the users included in the case.

CASE WHEN "_CEL_P2P_ACTIVITIES"."ACTIVITY_EN" = 'Create PO item' AND (HOURS("_CEL_P2P_ACTIVITIES"."EVENTTIME") >= 21 OR HOURS("_CEL_P2P_ACTIVITIES"."EVENTTIME") < 5) THEN 1.0 ELSE 0.0 END
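For a quick plausibility check of the time window, the same rule can be written in Python. The function name and defaults are assumptions taken from the example (21:00 to 05:00, activity "Create PO item"); note that the hour check is grouped so that the activity filter applies to both the late-evening and the early-morning branch.

```python
from datetime import datetime


def looks_automated_by_time(activity, eventtime,
                            target_activity="Create PO item",
                            night_start=21, night_end=5):
    """Return 1.0 if the target activity happened in the assumed off-hours window."""
    off_hours = eventtime.hour >= night_start or eventtime.hour < night_end
    return 1.0 if activity == target_activity and off_hours else 0.0


print(looks_automated_by_time("Create PO item", datetime(2024, 1, 8, 22, 30)))  # 1.0
print(looks_automated_by_time("Create PO item", datetime(2024, 1, 8, 14, 0)))   # 0.0
print(looks_automated_by_time("Change Price", datetime(2024, 1, 8, 23, 0)))     # 0.0
```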

d) Speed of activity

The time stamp can be used not only to look at when a certain activity took place but also at how quickly succeeding activities were performed.

The golden rule is that if a process timestamp is within 20 seconds of the previous process step, the later step is counted as performed automatically. The limitations are that the first process step used for the calculation cannot be classified. Further, slow reaction times of database servers might lead to imprecise timestamps.

CASE WHEN TARGET(REMAP_TIMESTAMPS("_CEL_P2P_ACTIVITIES"."EVENTTIME", SECONDS), ANY_OCCURRENCE[ ] TO ANY_OCCURRENCE[ ]) - SOURCE(REMAP_TIMESTAMPS("_CEL_P2P_ACTIVITIES"."EVENTTIME", SECONDS), ANY_OCCURRENCE[ ] TO ANY_OCCURRENCE[ ]) < 20 THEN 1.0 ELSE 0.0 END
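The 20-second rule and its limitation for the first step can be illustrated with a small Python sketch. The function name and sample timestamps are assumptions; the input is the ordered list of event times of a single case.

```python
from datetime import datetime, timedelta


def flag_fast_followers(timestamps, threshold_seconds=20):
    """Flag each step that follows its predecessor in under threshold_seconds.

    The first step has no predecessor and therefore stays unclassified (None).
    """
    if not timestamps:
        return []
    flags = [None]
    for prev, curr in zip(timestamps, timestamps[1:]):
        gap = (curr - prev).total_seconds()
        flags.append(1.0 if gap < threshold_seconds else 0.0)
    return flags


t0 = datetime(2024, 1, 8, 9, 0, 0)
# Hypothetical case with four steps.
case_times = [
    t0,
    t0 + timedelta(seconds=5),
    t0 + timedelta(minutes=5),
    t0 + timedelta(minutes=5, seconds=19),
]
speed_flags = flag_fast_followers(case_times)
print(speed_flags)  # [None, 1.0, 0.0, 1.0]
```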

e) Other indicators

The above-mentioned customizations can be quite handy. Nonetheless, there might be further indicators implying that an activity was automated or manual. An example is the activity "Create Purchase Requisition". As the respective requisition table (EBAN) has a field with the creator of the requisition, this field can be used to differentiate the automation level. This example is just meant to give a glimpse of what options are on the table.

III) Further recommendations & best practices

a) Level matters (e.g. PO header vs. PO items)

After we have explored the 101 of AR calculation, I would like to conclude this topic with some recommendations. First, to calculate an accurate AR, you have to take into account whether an activity was performed on header or item level. This is because activities done on the header level may initiate changes on an item level. But these changes are still associated with the manual user even though they are performed automatically.

A workaround here is to add an activities master data table to the data model to provide the needed information whether an activity was done on item or header level (see this help page for more information on the activities master data table).

b) Start small and scale later

The level of activities matters for validating whether steps are automated or not. Yet the entire definition of the AR is also impacted by the level at which you set up the metric. There is a huge difference between the following three levels, each with a different calculation:

  • Activity level AR (e.g. 50% of considered activities automated)
  • Item level AR (e.g. 50% of considered PO items fully automated)
  • Header AR (e.g. 50% of considered POs fully automated)

The go-to rule is to start small and scale later, once the processes have reached a more advanced level of automation.

  • Calculating the AR for just one activity is straightforward (# activity automated / # activity happening overall). This is already a good starting point that reduces validation effort and makes quick wins available.
  • Second, a considered set of activities defines the entire automation rate. With the exclusion of certain steps, this approach is also rather straightforward (# activities automated / # activities manual or automated).
  • As you reach higher levels of automation (“case level”), rules need to be defined when e.g. a PO item can be seen as automated. This means that there is a set A of activities that have to be automated and a set B of activities that do not need to be automated. The PO item is seen as automated when at least all the activities in set A are automated (# of cases conformant to automation rules / # of overall cases).
  • This is also applied to the next higher level of POs. There, e.g. all PO items need to be automated so that a PO can be counted as automated.
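The case-level rules from the list above can be sketched in Python. All names and sample data are hypothetical, and the sketch makes one assumption that you may want to handle differently: an item counts as automated when every occurring activity from set A is automated, so an item without any set A activity also counts as automated.

```python
def item_is_automated(item_events, required_activities):
    """True if all occurrences of set A activities on the item are automated.

    item_events: list of (activity, is_automated) pairs for one PO item.
    """
    return all(automated for activity, automated in item_events
               if activity in required_activities)


def po_is_automated(items, required_activities):
    """A PO counts as automated only if all of its items are automated."""
    return all(item_is_automated(events, required_activities)
               for events in items.values())


def rate(flags):
    """Share of True flags, or None for an empty list."""
    return sum(flags) / len(flags) if flags else None


set_a = {"Create PO item"}  # activities that have to be automated
# Hypothetical sample: PO -> item -> events.
purchase_orders = {
    "PO1": {"10": [("Create PO item", True), ("Change Price", False)],
            "20": [("Create PO item", True)]},
    "PO2": {"10": [("Create PO item", False)],
            "20": [("Create PO item", True)]},
}
item_flags = [item_is_automated(events, set_a)
              for items in purchase_orders.values()
              for events in items.values()]
po_flags = [po_is_automated(items, set_a) for items in purchase_orders.values()]
item_level_ar = rate(item_flags)
header_level_ar = rate(po_flags)
print(item_level_ar, header_level_ar)  # 0.75 0.5
```

The same data yields 75% on item level but only 50% on header level, which is why the level has to be stated whenever an AR is reported.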

c) You can already customize data on database level

Whether or not an activity is automated based on the various customizations from above can already be indicated in the CEL_XXX_ACTIVITIES table by extending it. This field “AUTOMATION” clearly indicates automation and is filled by considering different workarounds:

workaround a
workaround b

d) Third party systems can be problematic

As you might have noticed we have predominantly talked about one SAP system being connected to visualize a process flow with activities. The situation can get a bit more complex if you add third party systems to your leading source system (e.g. SAP Ariba, Readsoft, Coupa, Salesforce, Jaggaer, …). In case the information is only written back to the leading system and automation information is lost you might either consider the above-mentioned customizations (e.g. time between activities) or exclude those steps from AR calculation. This upfront consideration can save you a lot of discussion, time and frustration – believe me.

e) Be creative but accurate

Besides the already mentioned customizations from above, there are also other indicators that can help to calculate the AR. These indicators are mostly enterprise and process specific and might not be known at the beginning of the project.

Therefore, there will be various small changes to the calculation of the AR during the project. Altogether they add up to a holistic approach to calculating the AR of activities or processes. Often, domain knowledge is key to identifying such additional rules. For example, someone may know that when X was done manually, Y and Z were automated, or that a certain document type is always automated. It is important to be creative here. But it is even more important to be accurate and to know for sure that these rules are reliable and can be applied to the data. This should be validated and documented after implementation.

To get a head start, you can send out a short survey to the domain experts asking for such specific rules that indicate automation.

f) Make everyone understand the numbers

A very important aspect concerning the AR is to understand how it was calculated, why it was calculated in that way and what the final numbers presented mean. This needs to be stressed! Otherwise, it can happen that certain actions are triggered in an enterprise, but the underlying assumptions are not correct.

Additionally, documentation is relevant throughout any software implementation. Yet for the AR calculation it is paramount. Imagine you have used all the customizations mentioned above, excluded activities and changed the level to item level (not activity level). Without an explanation, no business user on earth would be able to comprehend what the AR displays. It is crucial that everyone uses the same definitions when talking about the AR and its calculation. Thus, I highly recommend adding proper documentation to the analysis (either an external link, an info button or an entire documentation page). The extra effort pays off in the long run and increases acceptance significantly!

I hope this post helps in some way. Nevertheless, this is just a subjective view and a starting point for discussion. Happy to hear your feedback!

PS: See also our other guide regarding pull functions HERE.


Hi Timo, thanks for the detailed guide!

We also frequently get asked which activities organizations could actually automate (in SAP). Although from the Celonis side we cannot give 100% guidance for every custom scenario, after seeing several customer processes we see some patterns emerging. See below a short summary of what came to our minds regarding the 4 standard processes:

| Purchase 2 Pay | Order 2 Cash | Accounts Payable | Accounts Receivable |
| --- | --- | --- | --- |
| Clear Invoice | Approve Credit Check | Clear Invoice | Clear Invoice |
| Create PO | Create Delivery | Enter in SAP | Create Invoice |
| Create PO Item | Create Delivery Document | Record Invoice | Record GI |
| Create Purchase Requisition | Create Invoice | Remove Payment Block | Send Overdue Notice |
| Delete Purchase Order Item | Create Shipment | VIM activities | |
| Record GR | Create SO | | |
| Record IR | Create SO Item | | |
| Record Order Confirmation | Extend COTD | | |
| Send Message PO | Receive Order | | |
| Send PO | Record GI | | |
| Transmit PO | Release Credit Hold | | |
| | Remove Delivery Block | | |
| | Send Invoice | | |
| | Send Order Confirmation | | |

All in all it seems like the SAP based automation of activities is focused on the creation and recording of certain items or developments.

Please do not treat this overview as complete, accurate or official; see it as a starting point for a discussion.

What is your experience with automation and process mining? Happy to hear your thoughts!

Best Regards
Manuel from Celonis
