Skip to main content

I want to understand the rationale behind creation of the automatic data extraction field: "_CELONIS_CHANGE_DATE"

Hi Anjith,

such field is useful when you want to know when a specific data was extracted or generated. You can then show it on screen in an analysis. For example, I use it in a duplicate checker analysis to inform the end users when the last Machine Learning job last ran so that they can better understand the results. It also is a good marker to inform if the job went into error, as if the timestamp doesn't change, then the data engineer needs to be informed.

 

If you have delta extraction and delta transformations, it enables to know what happened and when if ever something needs to be investigated. I had the case where an invoice status was not updated in Celonis while it was in SAP. I could identify why this change was not reaching Celonis and correct the extraction.

 

Field is based on date and time at moment of running. It is automatically added to the extracted tables, and is an additional field in the transformation activities to build the Activity table.

 

kind regards

Marc


Hi Anjith,

such field is useful when you want to know when a specific data was extracted or generated. You can then show it on screen in an analysis. For example, I use it in a duplicate checker analysis to inform the end users when the last Machine Learning job last ran so that they can better understand the results. It also is a good marker to inform if the job went into error, as if the timestamp doesn't change, then the data engineer needs to be informed.

 

If you have delta extraction and delta transformations, it enables to know what happened and when if ever something needs to be investigated. I had the case where an invoice status was not updated in Celonis while it was in SAP. I could identify why this change was not reaching Celonis and correct the extraction.

 

Field is based on date and time at moment of running. It is automatically added to the extracted tables, and is an additional field in the transformation activities to build the Activity table.

 

kind regards

Marc

Hi Marc,

The problem that I am facing is that even for Full extracts which were completed on the same day, I am receiving multiple dates within the field. I want to understand what is the logic for these multiple dates.


Hi Anjith,

I have checked on daily full extraction job related tables and see indeed that some tables may have different timestamps although one would expect a single one because of the full extraction. It looks like this occurs for huge tables.

_CEL_CHANGE_DATEI did a very quick cross check and didn't find duplicates though in those tables. And when I look at the download log of for example table CDHDR, it shows the same figure as the sum of the two sets of entries with different timestamps:

CDHDR_fullUnfortunately, I don't know why the same timestamp is not returned although it shows that all lines were extracted on that day... sorry for that. Yet in my daily activities as a Celonaut I don't really use the timestamp of full extractions because if something goes wrong with an extraction, this is clearly returned by the system.

 

kind regards

Marc

 

 


Hi Anjith,

I have checked on daily full extraction job related tables and see indeed that some tables may have different timestamps although one would expect a single one because of the full extraction. It looks like this occurs for huge tables.

_CEL_CHANGE_DATEI did a very quick cross check and didn't find duplicates though in those tables. And when I look at the download log of for example table CDHDR, it shows the same figure as the sum of the two sets of entries with different timestamps:

CDHDR_fullUnfortunately, I don't know why the same timestamp is not returned although it shows that all lines were extracted on that day... sorry for that. Yet in my daily activities as a Celonaut I don't really use the timestamp of full extractions because if something goes wrong with an extraction, this is clearly returned by the system.

 

kind regards

Marc

 

 

Oh, I think it is a kind of bug in the finalizing of the extraction job itself: when I look at the different dates in table CDPOS, it shows extractions on May 29 and 30 which are last Saturday and Sunday. Yet, I don't have any schedule running on the weekend... For the P2P process, my schedule is daily at 17:00 on week days (MON to FRI).


Oh, I think it is a kind of bug in the finalizing of the extraction job itself: when I look at the different dates in table CDPOS, it shows extractions on May 29 and 30 which are last Saturday and Sunday. Yet, I don't have any schedule running on the weekend... For the P2P process, my schedule is daily at 17:00 on week days (MON to FRI).

Exactly! I am seeing similar results. I have dates for full extracts going back to even 2020. This seems to be a bug in Celonis extractor and should be notified to the users.


Oh, I think it is a kind of bug in the finalizing of the extraction job itself: when I look at the different dates in table CDPOS, it shows extractions on May 29 and 30 which are last Saturday and Sunday. Yet, I don't have any schedule running on the weekend... For the P2P process, my schedule is daily at 17:00 on week days (MON to FRI).

Hi Anjith,

For your information, I have registered a Support request on the topic.

regards

Marc


Oh, I think it is a kind of bug in the finalizing of the extraction job itself: when I look at the different dates in table CDPOS, it shows extractions on May 29 and 30 which are last Saturday and Sunday. Yet, I don't have any schedule running on the weekend... For the P2P process, my schedule is daily at 17:00 on week days (MON to FRI).

Thanks Marc!! Really appreciate it.. Great convo


Hi Anjith,

I have received an answer to my support ticket.

There seems to be a technical limitation to defining such field when table contains over a million entries. In my case, it concerns the SAP change log and AP document tables. Although the complete extraction is done on the right date, the way that the data is aggregated back causes this hickup and returns different dates with the same time for a single extraction.

 

If the _CELONIS_CHANGE_DATE is of importance for your transformations, you can use the following workaround as very first transformation in your data job:

  • create a single table to register the latest date from one of the impacted table (yyy) . Since you execute full loads, all tables have the same latest date.

DROP TABLE IF EXISTS MAX_DATE;

CREATE TABLE MAX_DATE AS SELECT MAX(_CELONIS_CHANGE_DATE) AS MAX_DATE from yyy;

  • update the _CELONIS_CHANGE_DATE of the related tables (xxx) by replacing the _CELONIS_CHANGE_DATE with the timestamp registered in MAX_DATE. Note that you need this transformation for each impacted table.

UPDATE xxx

SET _CELONIS_CHANGE_DATE = MAX_DATE

FROM MAX_DATE;

 

this will do the job and will be quite fast: on the nearly 9 million records of CDPOS I have, it ran in less than 10 seconds.

kind regards

Marc


Reply