Hi,

 

I need to create a list of all table names and columns in each table that we have in Celonis.

 

I've tried pycelonis, but while I can get the tables, the columns list shows as empty when using get_tables(): DataPoolTable(name='LFA1', data_source_id='xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx', columns=[], ...)

 

I've also tried using a SQL query:

select table_schema,
       table_name
from v_catalog.columns

but got the error: "Not allowed to access schema v_catalog".

 

Is there a way to quickly download such a list?

 

Thanks,

Karol

First, the SQL for that is:

select table_name, column_name, data_type, data_type_length from columns where table_name like <table_name>

 


Caveat: it will not work for views.

 

Second, with PyCelonis you can iterate over all the tables of the data pool, then execute the SQL above, changing the table name each time. The output is a dataframe (I think).
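A minimal sketch of that loop, assuming PyCelonis 2.x; the base URL, API token, and data pool name are placeholders, and how you actually execute each statement (e.g. in a Data Integration transformation) depends on your setup:

from pycelonis import get_celonis

# Connect to the EMS team (URL and API token are placeholders).
celonis = get_celonis(base_url="https://<team>.celonis.cloud", api_token="<api_token>")

# Find the data pool and walk over its tables.
data_pool = celonis.data_integration.get_data_pools().find("MY_DATA_POOL")
for table in data_pool.get_tables():
    # Build the column-listing statement for this table.
    sql = (
        "select table_name, column_name, data_type, data_type_length "
        f"from columns where table_name like '{table.name}'"
    )
    print(sql)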

 

Another caveat: when you execute SQL this way, it will return only 100 rows. For your use case I guess that will not be a problem, but... well, shit happens and you can have a table with more than 100 columns.

Workaround: use Python to create a SQL script that creates a new table and then inserts into it the column details of all the tables. Execute the script from Data Integration. Add the resulting table to a data model and either consume it from Celonis Studio, or retrieve all the rows from Python using pycelonis.pql -- from pycelonis.pql import PQL, PQLColumn -- the result will be a dataframe that you can export to Excel, CSV or your preferred format.
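A sketch of what that generated script could look like, assuming PyCelonis 2.x; the collector table name ALL_COLUMNS, the output file name, and the connection details are all made up, and the single CREATE TABLE ... AS SELECT shown later in this thread achieves the same result without per-table INSERTs:

from pycelonis import get_celonis

# Connect and pick the data pool (URL, token and pool name are placeholders).
celonis = get_celonis(base_url="https://<team>.celonis.cloud", api_token="<api_token>")
data_pool = celonis.data_integration.get_data_pools().find("MY_DATA_POOL")

# One SQL script: an empty collector table, then one INSERT per table in the pool.
statements = [
    "CREATE TABLE ALL_COLUMNS AS "
    "SELECT table_name, column_name, data_type, data_type_length "
    "FROM columns WHERE 1 = 0;"  # copies only the structure
]
for table in data_pool.get_tables():
    statements.append(
        "INSERT INTO ALL_COLUMNS "
        "SELECT table_name, column_name, data_type, data_type_length "
        f"FROM columns WHERE table_name LIKE '{table.name}';"
    )

# Save the script so it can be pasted into a Data Integration transformation.
with open("all_columns_transformation.sql", "w") as f:
    f.write("\n".join(statements))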

 

Alternative: you can use the Data Explorer, although it is more "interactive", and I guess what you want is to quickly document the whole structure of your data pool.

 

HTH


Hi,

 

The SQL query that I mentioned in the first post came directly from the Vertica documentation (or a Vertica-related page), but while working with Celonis I do understand that there might be some limitations on what you can access from the Vertica engine perspective. Never mind that.

 

"I guess what you want is to quickly document the whole structure of your data pool" - that is exactly what I want. I need to build some documentation to figure out what I have inherited when I took over the job, and to make my life a little bit easier in the future.

 

But getting to the point.

Thank you for the response, the ideas and the SQL query.

As you've mentioned, the SQL query will show only 100 rows, and I wanted to avoid re-running it for each new table, as we have plenty of them loaded in Celonis.

 

So at the end of the day I've pretty much followed your workaround.

I've created a new table in Celonis using your query, but without using WHERE to filter the tables as I wanted to have them all in one go. Then I loaded that new table into a data model and, using pycelonis, downloaded the content into a data frame (apparently, in order to query data from Celonis, it needs to be loaded into a data model).
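For reference, a minimal sketch of that last step, assuming PyCelonis 2.x; the data pool, data model, table and column names below are placeholders for whatever the transformation produced:

from io import BytesIO

import pandas as pd
from pycelonis import get_celonis
from pycelonis.ems import ExportType
from pycelonis.pql import PQL, PQLColumn

celonis = get_celonis(base_url="https://<team>.celonis.cloud", api_token="<api_token>")
data_pool = celonis.data_integration.get_data_pools().find("MY_DATA_POOL")
data_model = data_pool.get_data_models().find("MY_DATA_MODEL")

# Pull the collector table (here called ALL_COLUMNS) into a dataframe via a CSV export.
query = PQL()
query += PQLColumn(name="table_name", query=' "ALL_COLUMNS"."TABLE_NAME" ')
query += PQLColumn(name="column_name", query=' "ALL_COLUMNS"."COLUMN_NAME" ')
query += PQLColumn(name="data_type", query=' "ALL_COLUMNS"."DATA_TYPE" ')

export = data_model.create_data_export(query=query, export_type=ExportType.CSV)
export.wait_for_execution()

df = pd.concat(pd.read_csv(BytesIO(chunk.read())) for chunk in export.get_chunks())
df.to_csv("celonis_tables_and_columns.csv", index=False)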

 

Thanks for the help,

Karol

 

 



"that there might be some limitations on what you can access from the Vertica engine perspective"

 

Yep, the fact that there are differences between Vertica and the implementation in Celonis is something that should be stated more clearly in the training and in the docs... it is very common (even for me) to forget it...

 

" but without using WHERE to filter the tables as I wanted to have them all in one go"

Yep, indeed... I just realized that a few minutes ago while having coffee... nice!

 

"Thanks for the help,"

 

My pleasure!


Hi Karol,

 

I was working on this as well and found a roundabout way of doing it.

 

In a transformation, create a table containing all the columns, like this:

 

CREATE TABLE [insert what you want to name the table] AS
SELECT *
FROM COLUMNS;

 

Then, load that table into a data model.

 

Using Python, you can use PQL() to build a query for the table you're looking for.

 

This is the Python code I'm using.

 

from pycelonis.pql import PQL, PQLColumn, PQLFilter, OrderByColumn
from pycelonis.ems import ExportType

query = PQL()
query += PQLColumn(name="Table Name", query=""" "All_Columns"."Table_Name" """)
query += PQLColumn(name="Column", query=""" "All_Columns"."column_name" """)
query += PQLColumn(name="Data Type", query=""" "All_Columns"."data_type" """)

data_export = data_model.create_data_export(query=query, export_type=ExportType.CSV)

 

Then use this:

data_export.wait_for_execution()

# Write all chunks into a single CSV; open the file once so later chunks
# are appended rather than overwriting earlier ones.
with open("Tables in Celonis.csv", "wb") as f:
    for chunk in data_export.get_chunks():
        f.write(chunk.read())

 

And you'll be able to export a CSV with all the tables and fields. The COLUMNS system table has additional fields, but those are the ones I chose to export.

 

Hope this helps!

