
Hi all,

 

I have set up an Action Flow that extracts data from a data model containing several fields and filters. The result is used to generate a CSV file.

At the end I get a table containing a lot of duplicates.

I haven't found any setting in Query data or in CSV Advanced to filter out complete duplicates. I also haven't found any other module I could use after CSV creation to remove duplicate rows.

 

Do you have any idea how to get rid of duplicates?

 

Thanks,

Dennis

Can't you use DISTINCT in your queries to the process model?

 

That said, it seems you can try the distinct function to remove duplicates in an array.

In my experience, it is best to do as many of the data operations as possible at PQL level, in the queries, and to limit the transformations at AF level.

 

HTH


Hi Guillermo,

 

your answer helped me find a way to get rid of the duplicates, but it also showed me again that, in my opinion, the current Action Flow modules are missing a lot of things that would make life much easier. I hope Celonis is working on more features, especially in the area of data handling within EMS itself.

 

I have now added two additional modules: after Query data I added a Flow Control Array Aggregator to get an array, and after that I use an Iterator with distinct(array).

Then I can use the output of the Iterator for the CSV Advanced module.

 

Thank you very much.

 

BR

Dennis


Hi Dennis,

 

could you please share at which point in your Action Flow you inserted the function? I have the same problem and I'm not quite sure where to use the function so that the duplicates are removed.

 

Thank you in advance!

BR

Julia

 


Hi Julia,

 

you can achieve this by using the Array aggregator and Iterator modules.

So between Query data module and CSV module you need to add those additional modules.

Within the Array aggregator module you select all the columns you get as output from Query data module to make an array out of it.

And in the Iterator module you just use the following code: {{distinct(10.array)}}

In the CSV module you can now use each individual column from the Iterator output without having duplicates.
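
To make clearer what the aggregate-then-distinct step is meant to do, here is a minimal Python sketch of the idea. It is not Action Flow code and I am not claiming the built-in distinct function works exactly like this internally; the function name distinct_rows and the sample columns (case_id, vendor) are made up for illustration. It treats a row as a duplicate only when every column matches and keeps the first occurrence.

```python
def distinct_rows(rows):
    """Return rows with complete duplicates removed, keeping first occurrences in order."""
    seen = set()
    unique = []
    for row in rows:
        # Build a hashable key from all columns, so a row is only
        # a duplicate if every column value matches an earlier row.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data standing in for the Query data output.
rows = [
    {"case_id": "A1", "vendor": "Acme"},
    {"case_id": "A1", "vendor": "Acme"},
    {"case_id": "B2", "vendor": "Acme"},
]
print(distinct_rows(rows))
```

In this sample the second A1 row is dropped, while B2 stays because it differs in one column.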

 

But this approach uses quite a lot of resources, which leads to long runtimes and, depending on the amount of data, might cause the AF to crash.
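
If the in-flow approach becomes too heavy, one option is to remove the duplicates from the finished CSV outside the Action Flow. The sketch below is only an assumption of how that could look in plain Python with the standard csv module; dedupe_csv and the sample data are hypothetical. It reads one row at a time, although it still has to remember every unique row it has already seen.

```python
import csv
import io


def dedupe_csv(src, dst):
    """Copy CSV rows from src to dst, skipping rows that were already written.

    Returns the number of rows written (including the header row).
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    seen = set()
    written = 0
    for row in reader:
        key = tuple(row)  # the complete row is the duplicate key
        if key in seen:
            continue
        seen.add(key)
        writer.writerow(row)
        written += 1
    return written


# Hypothetical CSV content; with real files, pass objects opened with newline="".
src = io.StringIO("case_id,vendor\nA1,Acme\nA1,Acme\nB2,Acme\n")
dst = io.StringIO()
dedupe_csv(src, dst)
print(dst.getvalue())
```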

 

Hope this helps you.

 

BR

Dennis


Thank you very much, Dennis!

This solved the problem.

 

BR

Julia

