Event Collection: Slow Extraction Config using Sybase

Dear Celonis Team,

we’re currently working on connecting a Sybase ASE database via the IBC Uplink Connector.
We’ve noticed, that the configuration of the tables in the extraction in some times takes ages.

In the extraction configuration, clicking on a large table with lots of rows the settings pane for that table (like filter settings and column selection) takes minutes to load or doesn’t load at all.

Turning on the Debug Mode for the extraction reveals why that is.
Looking at the log files on the uplink server, the following lines apper once I select a table in the IBC frontend:

2020-07-22 15:33:21.777  INFO 4508 --- [ol-2-thread-105] c.c.c.j.s.DatabaseConnectorService       : Metadata source being used is DRIVER_METADATA --EOM--  
2020-07-22 15:33:21.777  INFO 4508 --- [ol-2-thread-105] c.c.c.j.s.DatabaseConnectorService       : Calling DatabaseMetaDataService#getMetadata for table dbo$_CELONIS_wo_transfer --EOM--  
2020-07-22 15:33:21.777  INFO 4508 --- [ol-2-thread-105] .c.c.j.s.m.DatabaseDriverMetaDataService : Getting metadata for table dbo$_CELONIS_wo_transfer --EOM--  
2020-07-22 15:33:21.777  INFO 4508 --- [ol-2-thread-105] .c.c.j.s.m.DatabaseDriverMetaDataService : Executing getMetadata with the following statement 'SELECT * FROM "dbo"."wo_transfer"' --EOM--
2020-07-22 15:41:17.241  INFO 4508 --- [ol-2-thread-105] c.c.c.j.s.DatabaseConnectorService       : Metadata call finished and table structure retrieved with 44 columns for table dbo$_CELONIS_wo_transfer  --EOM--  
2020-07-22 15:41:17.241  INFO 4508 --- [ol-2-thread-105] c.c.c.j.s.DatabaseConnectorService       : Returning getMetadata response --EOM--  

You can see that the Uplink executes a SELECT * FROM [table], causing a full table scan.
That will of course take time for a table containing >10 Million rows. In this case, the SELECT took 8 Minutes to finish. I wonder why there’s no “LIMIT” (Or TOP)-Clause for this query.

Is there a workaround to avoid such behaviour? Furthermore, is this meta data check also during the actual extraction? Because that would make it pretty much unusable.

Thank you for your help.
Best
Max

Hello Max,

Thank you for raising this issue.

We are already working on an additional feature to allow for explicit limiting to counteract that behavior (see details below). This will require an update of your connector-jdbc.jar file. The new connector-jdbc.jar will be available on the help space beginning of next week.

The root cause of the issue is as following:
The jdbc extractor is using the setFetchSize() method to limit the result set. Though, some jdbc drivers (e.g. Athena, Netezza, …) do not respect the setFetchSize(n) that is sent and instead execute the whole query first (SELECT * FROM TABLE) before getting the n records specified in the setFetchSize. This causes major performance issues.

The new feature will allow you to explicitly provide a limit to the extraction/metadata call and thus improve performance.

I hope this clears up any concerns.

All the best,
Lara

Hi Lara,

thank you for your reply. Let’s see if the new connector version resolves this issue.

Best
Max

Hi Max,

the new version is now available to download on the help page.
You can set the limit within the table configuration for each table or in the extraction setting tab for all tables within the extraction.

Let me know, if we can support you here.

Dear Lara,

thank you for the Fix and the Update. With the new version, the query limitation works and we’re able to extract all the tables.