Hello Benedikt,
In the Machine Learning Workbench you can use the Celonis Python API (pycelonis). You can find how it works here:
https://python.celonis.cloud/docs/pycelonis/en/latest/notebooks/01_Pulling_data.html
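For illustration, pulling data from an Analysis with pycelonis 1.x could look roughly like the sketch below; the analysis name and the table/column names are placeholders, not values from this thread.

from pycelonis import get_celonis
from pycelonis.pql import PQL, PQLColumn

# Connect; inside the ML Workbench credentials are picked up automatically.
celonis = get_celonis()
analysis = celonis.analyses.find("MY_ANALYSIS")

# Build a PQL query for the columns you need.
query = PQL()
query += PQLColumn(query='"VBAP"."VBELN"', name="VBELN")
query += PQLColumn(query='"VBAP"."POSNR"', name="POSNR")

# Pull the result as a pandas DataFrame.
df = analysis.get_data_frame(query)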
Hi Benedikt,
At the moment, pulling full tables is only possible from an Analysis, as shown in the link that Paul shared.
There is also an experimental function on Transformation objects, get_data_frame, to run SQL statements from Python:
https://python.celonis.cloud/docs/pycelonis/en/release-1.2.0/reference/pycelonis.objects_ibc.Transformation.html
If the statement is a SELECT, it returns at most 100 rows of the result.
Best,
Simon
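For context, a rough sketch of how that experimental function can be reached; the pool, data job, and transformation names below are placeholders, and the exact behavior of get_data_frame may differ by pycelonis version, so check the linked reference.

from pycelonis import get_celonis

celonis = get_celonis()
pool = celonis.pools.find("MY_POOL")
data_job = pool.data_jobs.find("MY_DATA_JOB")
transformation = data_job.transformations.find("MY_TRANSFORMATION")

# Experimental: returns at most 100 rows for a SELECT statement.
df = transformation.get_data_frame("SELECT * FROM VBAP")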
Hi Paul, Simon,
Thanks for your insights.
What I would like to do is process data in the ML WB from tables that I don't actually need or want in the data model in their raw format. This would help me keep the data model small and prevent analysts from accessing tables in the workspaces that they are not supposed to see.
Is there another, more or less clean, solution to tackle this?
My ultimate goal is to first process data from raw tables in the ML WB and afterwards load this data into a data model for use in the analyses.
Best,
Benedikt
Hi Benedikt,
At the moment, the options I mentioned are the only ones available. Thanks for your input! We will take it into account.
Best regards,
Simon
Hi Simon,
Thanks for your answer. Looking forward to seeing such a feature in the ML WB.
Best,
Benedikt
Hello,
(You may have already solved this issue, but) I faced the same issue and solved it.
From the ML Workbench I called get_data_frame in a loop, like below:
select mandt,vbeln,posnr from vbap order by mandt,vbeln,posnr limit 100 offset 0
→ got a DataFrame of rows 1 to 100
select mandt,vbeln,posnr from vbap order by mandt,vbeln,posnr limit 100 offset 100
→ got a DataFrame of rows 101 to 200
select mandt,vbeln,posnr from vbap order by mandt,vbeln,posnr limit 100 offset 400
→ got a DataFrame of rows 401 to 490
select mandt,vbeln,posnr from vbap order by mandt,vbeln,posnr limit 100 offset 500
→ got an empty DataFrame
Keep adding 100 to the offset until an empty DataFrame is returned.
Sorting the data with ORDER BY on the key fields is always required so that the pages come back correctly.
Best regards,
Kazuhiko
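Kazuhiko's workaround as a loop could look roughly like the sketch below, assuming get_data_frame accepts a SQL string and that transformation is a pycelonis Transformation object as in the earlier sketch.

import pandas as pd

def pull_full_table(transformation, base_query, page_size=100):
    # Page past the 100-row limit: fetch page_size rows at a time
    # until an empty DataFrame comes back.
    chunks = []
    offset = 0
    while True:
        page = transformation.get_data_frame(
            f"{base_query} LIMIT {page_size} OFFSET {offset}"
        )
        if page is None or page.empty:
            break
        chunks.append(page)
        offset += page_size
    return pd.concat(chunks, ignore_index=True) if chunks else pd.DataFrame()

# The stable ORDER BY on key fields is what keeps pages from overlapping.
df = pull_full_table(
    transformation,
    "SELECT MANDT, VBELN, POSNR FROM VBAP ORDER BY MANDT, VBELN, POSNR",
)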
Hi there,
Is there still no way to pull tables from a data pool into the ML Workbench without first loading these tables into a data model?
Best regards,
Eike
Hi Eike,
Unfortunately, we do not offer the functionality to pull tables directly from data pools into the ML Workbench. May I ask what your use case is here?
Best,
Noor
Hi Noor,
I tried Kazuhiko's workaround and it worked for me.
Thanks.
Hi all,
I used the experimental function @s.riezebos mentioned and then applied the loop as explained by @kaztakata; it worked for me.
Thanks for the explanation, very useful.
Thank You,
Amruth Muddana
Hi Noor,
The use case for me is data cleaning. Vertica SQL is extremely limited in its functionality, while Python is very useful for preparing customer data from within Celonis (data security). For us (Apolix) it is critical to be able to get the full dataset from the Event Collection. Can you please help me out?
Hi @JHermans ,
Can you please explain further what kind of data cleaning you are aiming for that is very hard or even impossible to do in SQL but that Python can solve quickly?
For instance, an example of the cleaning approach you are aiming for, and how the data in its cleaned form will help you, would be great.
This will help us formulate the request better.
Best,
Noor
Hi Noor,
Thanks for your quick reply. First of all, this is for PoVs and implementations, where I dont have access to Excels. For these large datasets, quite often a lot of data cleaning is required. The usecase I have now it the fact that the datetimes in the original data are not read correctly by Celonis (due to format), I have to upload the data as Strings (VARCHAR), however CAST(xxxx TO DATETIME) basically does not do the job. From within Python I am able to handle data in a more structured and scalable way.
To further explain, maybe lets have a call about this. You can reach me at joost@apolix.nl
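As an illustration of this kind of cleaning, parsing an awkwardly formatted VARCHAR column once it is in a pandas DataFrame could look like the sketch below; the column name and format string are made-up examples.

import pandas as pd

# Hypothetical example: a datetime column stored as strings in a
# format that a plain CAST does not handle.
df = pd.DataFrame({"ERDAT": ["31.12.2019 23:59:59", "01.01.2020 00:00:01"]})

# Parse with an explicit format string instead of relying on CAST.
df["ERDAT"] = pd.to_datetime(df["ERDAT"], format="%d.%m.%Y %H:%M:%S")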
Hi,
I will follow up with you over email.
Thanks
Best,
Noor