Hello everyone,

 

I have set up an AF that sends data to our purchasers and also sends the data via a HTTP module to a Webhook:

About 6.5k data rows are queued in the webhook AF and pushed to a table in the data pool every 2 minutes (300 data rows per execution).

At the end of the AF run, the table in the data pool does not contain all 6.5k rows; some data is missing. I therefore suspected that some executions did not run correctly, and looking at the executions confirmed it. Some of them show this error message:

(screenshot of the error message)

I tried filtering out these executions by adding a BREAK module (with a filter for status code 400, to stop the execution and store it as an incomplete execution), but this does not do what I wanted.

 

Did somebody else have the same issues?

 

Thank you!

 

Best regards,

Julia Bauer

Hi @julia.bauer,

 

My first guess would be that your data contains values/characters that cannot be converted to Parquet.

For example, null values are allowed in Celonis string columns, but in Parquet this should be an empty string (" ") (I am not sure how CSV handles this, though).

 

To test what is going on, you could create a temporary flow that sends you the CSVs via mail. In a Python script, you could then manually convert the files that failed to Parquet. That way you will get more information about what the error actually is.

 

A simple command to do so is:

import pandas as pd

# Read the CSV that failed and try converting it to Parquet;
# any conversion error should surface here with a more useful message.
df = pd.read_csv('example.csv')

df.to_parquet('output.parquet')

 

The next step could be to apply functions to your data to replace these empty values, for instance using COALESCE (celonis.com): COALESCE("table"."column", ' ').

If there are some characters that are not supported, you could use REMAP_VALUES (celonis.com) to replace them.
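If you test locally in Python first, a rough pandas equivalent of that cleanup might look like this. This is only a sketch with made-up data; "vendor" is a placeholder column name, and the NUL character stands in for whatever unsupported character your data actually contains:

import pandas as pd

# Hypothetical example data; the column name and values are placeholders.
df = pd.DataFrame({"vendor": ["ACME", None, "Foo\x00Bar"]})

# COALESCE-style: replace nulls with a single space.
df["vendor"] = df["vendor"].fillna(" ")

# REMAP_VALUES-style: strip a character the Parquet conversion may choke on
# (here the NUL character, as an example).
df["vendor"] = df["vendor"].str.replace("\x00", "", regex=False)

print(df["vendor"].tolist())  # ['ACME', ' ', 'FooBar']

Once the local conversion to Parquet succeeds, the same replacements can be expressed in PQL with COALESCE and REMAP_VALUES.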

 

I hope this helps.

 

Best regards,

Jan-peter



Hello Jan-Peter,

 

Thanks a lot for your suggestion!

I contacted customer support about this issue. They proposed specifying the data types of the columns explicitly, since by default they are all of type VARCHAR. Do you think that might also be the problem?
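In case it helps anyone testing this locally: pandas also lets you force column types when reading the CSV, instead of letting everything default to strings (the equivalent of every column being VARCHAR). The column names and values below are made up for illustration:

import pandas as pd
from io import StringIO

# Made-up sample CSV standing in for the exported data.
csv_data = StringIO("order_id,amount\n1001,19.99\n1002,5.00\n")

# Force explicit types per column rather than accepting the defaults.
df = pd.read_csv(csv_data, dtype={"order_id": "string", "amount": "float64"})

print(df.dtypes)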

I'll take a look at the null-values-in-Parquet suggestion, too.

 

Best regards,

Julia Bauer

