I wanted to get the extraction logs (like date, info, message). And I wanted to get this data through an API (Pycelonis) so I can post it elsewhere.

Hi,

 

I think the only option is to send an api_request from Pycelonis to retrieve the internal API responses.

 

The syntax for getting a specific extraction log would be something like this:

 

First you need to get the status and the job IDs:

1a) GET https://<team_name>.celonis.cloud/integration/api/pools/<DataPoolID>/logs/status

 

1b) GET https://<team_name>.celonis.cloud/integration/api/pools/<DataPoolID>/jobs

 

The second step is to get more data about each job and see its executions:

2) GET https://<team_name>.try.celonis.cloud/integration/api/pools/<DataPoolID>/jobs/bf6d23b7-85b4-419e-ae0b-bcd7975c1b50/loads

 

The third step is to get the events for a specific execution ID. I'm not sure how to get the <id> item; <ExecutionID> can probably be taken from <latestExecutionItemId> in step 2.

3) GET https://<team_name>.celonis.cloud/integration/api/pools/<DataPoolID>/logs/executions/detail?executionId=<executionid>&id=<id>&type=TASK&limit=200&page=0&logLevels=DEBUG%2CINFO%2CWARNING%2CERROR
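
The three steps above can be sketched as a small URL builder. This is only an illustration: the helper name build_log_urls and its parameters are hypothetical, the paths and query parameters are copied from the requests listed above, and these are internal endpoints that may differ between teams or change over time.

```python
from urllib.parse import urlencode


def build_log_urls(team, pool_id, job_id=None, execution_id=None,
                   item_id=None, limit=200, page=0):
    """Build the internal API URLs from steps 1-3 (hypothetical helper)."""
    base = f"https://{team}.celonis.cloud/integration/api/pools/{pool_id}"
    urls = {
        "status": f"{base}/logs/status",  # step 1a
        "jobs": f"{base}/jobs",           # step 1b
    }
    if job_id:
        urls["loads"] = f"{base}/jobs/{job_id}/loads"  # step 2
    if execution_id and item_id:
        # step 3: urlencode escapes the commas in logLevels as %2C
        query = urlencode({
            "executionId": execution_id,
            "id": item_id,
            "type": "TASK",
            "limit": limit,
            "page": page,
            "logLevels": "DEBUG,INFO,WARNING,ERROR",
        })
        urls["detail"] = f"{base}/logs/executions/detail?{query}"
    return urls


# Placeholder values only; replace them with your own team and IDs.
urls = build_log_urls("myteam", "pool-1", "job-1", "exec-1", "item-1")
for name, url in urls.items():
    print(name, url)
```

Each resulting URL can then be passed to celonis.api_request as in the snippet further down.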

 

It is best to first work out the chain of links in the browser and then write the Python code using the standard api_request snippet. The resulting DataFrame can of course be easily uploaded into the data pool as a new table. You will probably also need to implement paging, as the usual page size limit is 200 items; if the log is longer, you need to loop through the pages. If you find an optimal, working solution, please share it with us :)
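
A minimal sketch of the page loop mentioned above. The function fetch_all_pages and the fetch_page callable are hypothetical names; the stop condition (a page shorter than the limit means it was the last one) is an assumption about how the endpoint behaves, so please verify it against your own responses.

```python
def fetch_all_pages(fetch_page, limit=200, max_pages=1000):
    """Collect items page by page until a short (or empty) page is returned.

    fetch_page(page, limit) must return the list of items for that page.
    max_pages is a safety cap so a misbehaving endpoint cannot loop forever.
    """
    rows = []
    for page in range(max_pages):
        batch = fetch_page(page, limit)
        rows.extend(batch)
        if len(batch) < limit:  # assumed signal for "last page"
            break
    return rows


# Stand-in for the real request: slices a local list instead of calling the API.
fake_log = [f"log line {i}" for i in range(450)]


def fake_fetch(page, limit):
    return fake_log[page * limit:(page + 1) * limit]


all_rows = fetch_all_pages(fake_fetch)
print(len(all_rows))
```

In real use, fetch_page would build the step 3 URL with the given page and limit, call celonis.api_request on it, and return the list of log entries from the response.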

 

Code snippet:

from pycelonis import get_celonis

import pandas as pd

 

#Getting data

URL = "<Specific_URL_Link>"

celonis = get_celonis(url='https://<team_name>.try.celonis.cloud',key_type='APP_KEY', permissions=False)

json = celonis.api_request(URL, message=None, method='auto', timeout='default', get_json=True)

df = pd.DataFrame.from_dict(json['<dictionaryKey>'], orient='columns')

 

#Pushing data to Celonis

data_pool = celonis.pools.find("<DataPoolID>")

data_pool.create_table(table_name="<NewTableName>",

                      df_or_path=df,

                     if_exists="drop")

 

#Viewing data frame

df

 

Best Regards,

Mateusz Dudek

 

 

 


Hi,

 

I tried your solution but I am getting an error for the line

--> json = celonis.api_request(URL, message=None, method='auto', timeout='default', get_json=True)

 

PyCelonisHTTPError and PyCelonisHTTPError: Got unexpected HTML document response.

 


Hi,

 

In the URL I can see that you've used the syntax of a UI page. That one returns only HTML code, which cannot be rendered properly because the JavaScript won't work as intended with such a request.

 

That specific URL is used when navigating to specific execution logs, and you can see it doesn't contain "/integration/api/". To get the data you want, you need to send requests to the internal API that supplies the information to the interface renderer.

 

https://<Team-URL>.celonis.cloud/integration/ui/pools/<DataPool>/data-configuration/data-jobs/<datajobID>/executions/<executionID>/logs%3C/span%3E%3Cspan

 

If you replace the values within the <> characters and use one of my links, you'll see the JSON viewer in the browser with the raw data, which can be retrieved using the celonis.api_request function.

 

Usually you can obtain these request URLs from the "Network" tab of the browser's developer tools by checking the GET requests with status 200 and names such as "logs", "jobs", etc.
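
Before sending a request, it may help to check that the URL really points at the internal API and not at a UI page. A small sketch, assuming the only distinguishing mark is the "/integration/api/" path segment mentioned above; the helper name is_api_url and the example URLs are made up for illustration.

```python
from urllib.parse import urlparse


def is_api_url(url):
    """Return True if the URL path contains the internal API segment."""
    return "/integration/api/" in urlparse(url).path


# Placeholder URLs; "myteam", "pool-1" etc. are not real IDs.
ui_url = ("https://myteam.celonis.cloud/integration/ui/pools/pool-1/"
          "data-configuration/data-jobs/job-1/executions/exec-1/logs")
api_url = "https://myteam.celonis.cloud/integration/api/pools/pool-1/logs/status"

print(is_api_url(ui_url))
print(is_api_url(api_url))
```

A URL that fails this check will most likely give the "unexpected HTML document response" error from api_request.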

 

Best Regards,

Mateusz Dudek


Hi,

 

I have tried the URL syntax and it is still showing same PyCelonisHTTPError

 

Thanks & Regards,

Vijayaraja J


Also, in the marketplace you can find an app for Data Pipeline & Consumption Monitor, but you need a Monitoring Data Pool for it to work. So if we can use the API, we can bypass the creation of the Monitoring Data Pool.

 


Hi,

 

1) After enabling custom monitoring in the administrator options, the Monitoring Data Pool is generated automatically and should be visible in a day or two; then you can use the app. The only problem is that it contains only very basic information, such as the start timestamp of the extraction, the end timestamp, and whether it was successful.

 

2) In the error you've provided I can see the URL you used. It's not the correct syntax, as it returns an HTML document, which cannot be transformed into a JSON object or DataFrame and pushed to Celonis. You can even see that it's not an API URL, as it contains /integration/ui/pools and not /integration/api/pools/. Additionally, it contains some garbage like "styles:color:rgb (...)" at the end.

 

As mentioned earlier, first send a request to:

 

URL = "https://capgemini.in-sandbox.demo.celonis.cloud/integration/api/pools/9359a45a-df4f-4241-aa07-af84073b7c81/logs/status"

 

Best Regards,

Mateusz Dudek

 


Log events are retrieved in ascending order based on their timestamp values as depicted in the following response:

 

{
  "meta": {
    "page": {
      "after": "eyJhZnRlciI6IkFRQUFBWFVBWGQ5MU05d3lUZ0FBQUFCQldGVkJXR1E1TVZaclFtRnpkRVoyVEc5QlFRIn0"
    }
  },
  "data": [
    {
      "attributes": {
        "status": "info",
        "service": "pageViewService",
        "tags": [
          "source:postman",
          "project:test"
        ],
        "timestamp": "2020-10-07T00:01:41.909Z",
        "host": "my.sample.host",
        "attributes": {
          "hostname": "my.sample.host",
          "pageViews": "700",
          "user": "steve",
          "service": "pageViewService"
        },
        "message": "Sample message"
      },
      "type": "log",
      "id": "AQAAAXUAXRYVJGxvDQAAAABBWFVBWFJZVmd2ZlktbUdUZjRBQQ"
    },
    {
      "attributes": {
        "status": "info",
        "service": "pageViewService",
        "tags": [
          "source:postman",
          "project:test"
        ],
        "timestamp": "2020-10-07T00:01:57.586Z",
        "host": "my.sample.host",
        "attributes": {
          "hostname": "my.sample.host",
          "pageViews": "500",
          "user": "bob",
          "service": "pageViewService"
        },
        "message": "Sample message"
      },
      "type": "log",
      "id": "AQAAAXUAXVNSvuMvWwAAAABBWFVBWFZOU2I2ZWcxX3c2LVVBQQ"
    },
    {
      "attributes": {
        "status": "info",
        "service": "pageViewService",
        "tags": [
          "source:postman",
          "project:test"
        ],
        "timestamp": "2020-10-07T00:02:33.461Z",
        "host": "my.sample.host",
        "attributes": {
          "hostname": "my.sample.host",
          "pageViews": "450",
          "user": "chris",
          "service": "pageViewService"
        },
        "message": "Sample message"
      },
      "type": "log",
      "id": "AQAAAXUAXd91M9wyTgAAAABBWFVBWGQ5MVZrQmFzdEZ2TG9BQQ"
    }
  ],
  "links": {
    "next": "https://api.smsala.com/api/v2/logs/events?sort=timestamp&filter%5Bquery%5D=%2A&page%5Bcursor%5D=eyJhZnRlciI6IkFRQUFBWFVBWGQ5MU05d3lUZ0FBQUFCQldGVkJXR1E1TVZaclFtRnpkRVoyVEc5QlFRIn0&filter%5Bfrom%5D=2020-10-07T00%3A00%3A00%2B00%3A00&filter%5Bto%5D=2020-10-07T00%3A15%3A00%2B00%3A00"
  }
}

 


Hi,

 

How would a log response from an SMS API (smsala) help with getting logs from a Celonis EMS data pool?

Additionally, it's misleading advice, as neither the internal nor the non-internal Celonis API uses "next page links". That matters because there are many pagination methods available.

 

General remark: many APIs have different response schemas, so if you reuse a whole response schema from another API, it usually won't work because of the differences.
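
To illustrate how different the pagination methods are, here is a sketch of the "next link" style used by the sample response posted earlier in this thread. It follows links.next until it is missing, which is not how the Celonis endpoints discussed here work (they take page and limit parameters). The function name follow_next_links and the fake pages are hypothetical; the "data" and "links"/"next" keys are taken from that sample response.

```python
def follow_next_links(fetch_json, first_url, max_pages=1000):
    """Cursor/link style paging: keep requesting body["links"]["next"].

    fetch_json(url) must return the parsed JSON body as a dict.
    max_pages is a safety cap against endless link chains.
    """
    items, url = [], first_url
    for _ in range(max_pages):
        if not url:
            break
        body = fetch_json(url)
        items.extend(body.get("data", []))
        url = body.get("links", {}).get("next")  # absent on the last page
    return items


# Fake two-page API kept in a dict, standing in for real HTTP calls.
fake_pages = {
    "page-1": {"data": [{"id": "a"}, {"id": "b"}], "links": {"next": "page-2"}},
    "page-2": {"data": [{"id": "c"}], "links": {}},
}

events = follow_next_links(fake_pages.__getitem__, "page-1")
print([e["id"] for e in events])
```

A loop written for one style will not work against the other, so always check first which one the API you are calling actually uses.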

 

Best Regards,

Mateusz Dudek

