Create your Event Log from your tables: quick and easy

The Snap team is always looking for ways to improve the user experience and simplify Process Mining so that our users can analyse more processes.

For the past two weeks, one question has been on our minds: what if we could develop a feature that lets our users generate their Event Logs from raw data in an easy and intuitive way? This would spare users the frustration of not knowing what an event log is or where to find one.

For instance, let’s look at the data coming from a Service Desk application and how we can generate an Event Log from it. Most of the time, such applications consist of two main tables: “Tickets” and “Ticket changes”.

The Tickets table consists of columns such as: ID, Created at, Updated at, Commented at, User and Category.

The Ticket changes table consists of columns such as: ID, Status changed at, Assigned at, Deleted at, Department, etc.

If we had a feature that automatically converted these tables into an Event Log, the result would look like this:
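To make the target shape concrete, here is a minimal sketch in pandas. The column names follow the two tables described above, but the sample values are made up for illustration, and using `melt` is just one way to do the conversion: every filled timestamp column becomes one row with an Activity and a Timestamp.

```python
import pandas as pd

# Hypothetical sample rows: column names follow the post, values are made up.
tickets = pd.DataFrame({
    "ID": [1, 2],
    "Created at": ["2020-01-01 09:00", "2020-01-02 10:00"],
    "Updated at": ["2020-01-01 11:00", None],
    "User": ["anna", "ben"],
    "Category": ["Hardware", "Software"],
})
tickets_changes = pd.DataFrame({
    "ID": [1, 2],
    "Assigned at": ["2020-01-01 09:30", "2020-01-02 10:15"],
    "Deleted at": [None, "2020-01-03 08:00"],
    "Department": ["IT", "IT"],
})

timestamp_columns = ["Created at", "Updated at", "Assigned at", "Deleted at"]

# Join the two tables on the ticket ID, then turn each timestamp column into rows.
merged = tickets.merge(tickets_changes, on="ID")
event_log = (
    merged.melt(
        id_vars=[c for c in merged.columns if c not in timestamp_columns],
        value_vars=timestamp_columns,
        var_name="Activity",
        value_name="Timestamp",
    )
    .dropna(subset=["Timestamp"])      # an empty timestamp means the activity never happened
    .sort_values(["ID", "Timestamp"])  # order the events within each ticket
    .reset_index(drop=True)
)
print(event_log)
```

Each ticket ID acts as the case, the former column name becomes the activity, and the remaining columns (User, Category, Department) are carried along as attributes.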

We created a script that does this for you in case you need it. It comes with a guide on how to use it, and it’s super easy to use.

Now theoretically we could upload this event log to Snap and start analysing our process as you can see below.

Did the script help you? Let us know!

The link to the tables:

The Python script, including a how-to guide:

Hi kentzler,

Thanks for the script, it will help me a lot.

Here are some other improvements that could be made to the import process of Snap. The disadvantage of Snap is that you need a perfect .xls or .csv file to do the import.

  • Add a start/stop datetime for each activity. This would help to measure the duration of the activity.
  • Automatically remove lines if the date is missing.
  • Automatically set the type to string if a string is found during the import.
  • Automatically set the attributes to null if they are missing.
  • Allow the aggregation of columns (for example, date + time columns, or activity number + name) to create Timestamps or Activities.
  • Allow some columns to be ignored.

Hope it’ll help for next releases.
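Some of these steps can also be approximated by hand in pandas before uploading a file. The following is only a sketch under assumptions: the raw export, its column names (`Case`, `Date`, `Time`, `Activity number`, `Activity name`, `Internal note`) and its values are all hypothetical, chosen to mirror the suggestions in the list above.

```python
import io
import pandas as pd

# Hypothetical raw export: every column name and value here is made up.
raw_csv = io.StringIO(
    "Case,Date,Time,Activity number,Activity name,Internal note\n"
    "A1,2020-01-01,09:00,10,Create ticket,foo\n"
    "A1,,09:30,20,Assign ticket,bar\n"
    "A2,2020-01-02,10:00,10,Create ticket,\n"
)

# Read every column as text so mixed content does not break the import,
# and ignore a column that is not needed.
df = pd.read_csv(raw_csv, dtype=str, usecols=lambda name: name != "Internal note")

# Remove lines where the date is missing.
df = df.dropna(subset=["Date"])

# Aggregate columns: date + time -> Timestamp, number + name -> Activity.
df["Timestamp"] = pd.to_datetime(df["Date"] + " " + df["Time"])
df["Activity"] = df["Activity number"] + " " + df["Activity name"]

clean = df[["Case", "Activity", "Timestamp"]].reset_index(drop=True)
print(clean)
```

Empty attribute cells are already read as missing values by `read_csv`, which covers the "set to null" case without extra code.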

Could you please post the script with a download link from your servers? Due to security concerns, our company has banned all third-party cloud file shares.

You can find more examples in our documentation within Celonis Snap.

Here is the script:

How to create my Event Log?

NOTE: For the moment, this script only works for CSV files.

  1. Go to the Jupyter website: https://jupyter.org/try
  2. Choose the second tile option, “Jupyter Lab”.
  3. Upload the CSV files that you want to convert to an event log by clicking the upward-pointing arrow icon (top left, on the sidebar).
  4. Create a new Python 3 notebook by clicking the plus button.
  5. Paste the Python script (below) into the new notebook.
  6. Follow the comments in the code. You’ll need to modify the names of the files to match the files you uploaded (marked with pink).
  7. Finally, modify the names of the columns that contain the timestamps (marked with orange).
  8. Run the cell by clicking the Play button.
  9. Your Event Log will appear in the sidebar with the name event_log.csv.
  10. You can now download the .csv file and upload it to Celonis Snap!
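Before editing the file and column names in the script, it can help to check what was actually uploaded. The helper below is a small sketch and not part of the original script; `list_csv_columns` is a hypothetical name. It lists every CSV file in the notebook's working directory together with its column headers.

```python
from pathlib import Path

import pandas as pd


def list_csv_columns(folder="."):
    """Return {file name: [column names]} for every CSV file in `folder`."""
    result = {}
    for path in sorted(Path(folder).glob("*.csv")):
        # nrows=0 reads only the header row, so large files stay cheap to inspect.
        result[path.name] = list(pd.read_csv(path, nrows=0).columns)
    return result


# Print the files and headers found next to the notebook.
for name, cols in list_csv_columns().items():
    print(name, "->", cols)
```

The printed file names are the ones to paste into the `read_csv` calls, and the headers that hold dates or timestamps are the candidates for the activity columns.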

The python script

import pandas as pd

# Step one: read your files (they must be in the notebook's working directory)
tickets = pd.read_csv("data - tickets.csv")
tickets_changes = pd.read_csv("data - tickets_changes.csv")

# Step two: merge your files on your ID column. (This requires that both files have an ID column)
df = tickets.merge(tickets_changes, on="ID")

# Step three: define the columns that represent your activities from your files.
# Hint: look for the columns that store dates or timestamps
columns = ["Created at", "Updated at", "Deleted at", "Commented at", "Status changed at", "Assigned at"]

# Run it to generate your event log
# Every other column is kept as an attribute of the event
attributes = [col for col in df.columns if col not in columns]

# Turn each timestamp column into rows: the column name becomes the Activity,
# and its value becomes the Timestamp
df_new = df.melt(id_vars=attributes, value_vars=columns, var_name="Activity", value_name="Timestamp")

# Drop activities that never happened (empty timestamp)
df_new = df_new.dropna(subset=["Timestamp"]).reset_index(drop=True)

df_new.to_csv("event_log.csv", index=False)
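Once event_log.csv exists, a quick sanity check before uploading can catch unparseable dates. This is a sketch only: `check_event_log` is a hypothetical helper, and the default column names (`ID`, `Activity`, `Timestamp`) assume the output shape of the script above.

```python
import pandas as pd


def check_event_log(event_log, case_col="ID", activity_col="Activity", time_col="Timestamp"):
    """Parse timestamps, sort events within each case and report basic counts."""
    log = event_log.copy()
    # errors="coerce" turns unparseable timestamps into NaT so they can be counted.
    log[time_col] = pd.to_datetime(log[time_col], errors="coerce")
    summary = {
        "events": len(log),
        "cases": log[case_col].nunique(),
        "activities": log[activity_col].nunique(),
        "bad_timestamps": int(log[time_col].isna().sum()),
    }
    log = log.sort_values([case_col, time_col]).reset_index(drop=True)
    return log, summary


# Hypothetical rows in the shape the script writes; with a real file, use
# check_event_log(pd.read_csv("event_log.csv")) instead.
sample = pd.DataFrame({
    "ID": [1, 1, 2],
    "Activity": ["Updated at", "Created at", "Created at"],
    "Timestamp": ["2020-01-01 11:00", "2020-01-01 09:00", "not a date"],
})
sorted_log, summary = check_event_log(sample)
print(summary)
```

A non-zero `bad_timestamps` count points to rows whose date format should be fixed in the source tables before the upload.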