Hi Team,
Im trying to convert the data i need from the datapool into a pandas dataframe for further calculation.
In the documentation i could only find how to read the parquet files and convert into Pandas.
Is there any way?
Let me know
Thanks
Hi Team,
Im trying to convert the data i need from the datapool into a pandas dataframe for further calculation.
In the documentation i could only find how to read the parquet files and convert into Pandas.
Is there any way?
Let me know
Thanks
Hello,
We pull data from a the data model in a data pool. The following code works in pycelonis 2.0 or higher. The output is a pandas data frame, the input is a data model object with the table that you want to export and the tablename.
Hint: You have to execute a query over the data model to extract (export) data, so we are dinamically creating the SQL with a "for" statement. This helps in case we change the table definition so we don't need to change the script.
def getDataFrameFromDataModel(dataModel, tableName):
dataPoolTable = dataModel.get_tables().find(tableName)
from pycelonis.pql import PQL, PQLColumn, PQLFilter
query = PQL()
cel_name = ""
for colx in dataPoolTable.get_columns():
col = colx.name
name = f'{col}'
cel_name = f'"{tableName}"."{col}"'
query += PQLColumn(query=cel_name , name=name)
# print(query)
data_frame = dataModel.export_data_frame(query)
data_framea'source'] = tableName
# print(data_frame.head())
return data_frame
We are using a global constant for the data model name. The code below pulls the data model:
dataModelName = DATA_MODEL
dataModel = dataPool.get_data_models().find(dataModelName)
This is the sample in API documentation that we use as base for the code above:
Regards, Esteban.
Hi @esteban.leiva12 ,
Thank you for your reply.
I tried using this, but i want to get the data from the connections/files.
For ex, i have uploaded an excel via the upload files option into the connection.
I need to access the data from that excel as dataframes in ML, is there any such option?
Please let me know.
Thank you
Hi @Rahul Bhat , I don't really know. It seems like trying to get data from the data source.
The file that you uploaded created a table in the data pool. You can create a dummy data model with that table created with the excel file. This would work.
I tried to pull data from tables in the data pool but I didn't find any option for that. As far as I know, the only way to extract data is via query (export) to the data model.
Hi @Rahul Bhat , I don't really know. It seems like trying to get data from the data source.
The file that you uploaded created a table in the data pool. You can create a dummy data model with that table created with the excel file. This would work.
I tried to pull data from tables in the data pool but I didn't find any option for that. As far as I know, the only way to extract data is via query (export) to the data model.
Hi @Rahul Bhat & @esteban.leiva12,
This is indeed the way to go. Since the PyCelonis API v.2+, only data from data models can be retrieved, previously this was also possible from transformations, but only for a limited number of rows.
Best regards,
Jan-peter
Hi @Rahul Bhat & @esteban.leiva12,
This is indeed the way to go. Since the PyCelonis API v.2+, only data from data models can be retrieved, previously this was also possible from transformations, but only for a limited number of rows.
Best regards,
Jan-peter
Hi @janpeter.van.d @esteban.leiva12
Thank you for your comments.
Yes i was under the impression that the data could still be taken directly from the datapool, i was not aware we cant do that any more.
I will try taking data from datamodel and then insert it back to the table in the datamodel.
@esteban.leiva12 The code is still a bit confusing for me and its giving out a few errors. Il try solving it and if not i might drop another comment for your help.
Thank you
Found the answer
columns=y.get_columns()
from pycelonis.pql import PQL,PQLColumn,PQLFilter,OrderByColumn
query= PQL()
for column in columns:
query.add(PQLColumn(name=column.name,query=f""" "{y.name}"."{column.name}" """))
df=h.export_data_frame(query)
Found the answer
columns=y.get_columns()
from pycelonis.pql import PQL,PQLColumn,PQLFilter,OrderByColumn
query= PQL()
for column in columns:
query.add(PQLColumn(name=column.name,query=f""" "{y.name}"."{column.name}" """))
df=h.export_data_frame(query)
What is 'h' in last line(df=h.export_data_frame(query)) ?
Enter your E-mail address. We'll send you an e-mail with instructions to reset your password.