Question
Ah! You almost nailed it with the new RANGE_APPEND function!
Dear Celonis team.
First, let me congratulate you for continuously improving this wonderful tool.
RANGE_APPEND was a very much anticipated and need function for me. However It misses one small detail that I was expecting: Gap Filling and Interpolation (GFI), like in the Vertica TIMESERIES Clause (https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/AnalyzingData/TimeSeries/GapFillingAndInterpolationGFI.htm)
The Problem
When showing a series of date related records, lets say the daily account balance of my banking customers, I need to fill not just a date value for each day, but also to assume the same account balance of the previous day (non null record), if the balance hasnt changed on a specific date. RANGE_APPEND does part of it. It fills the gaps in the days that dont exist, but assumes 0 or null on the KPI values for the columns in my data table. This is OK, but not enough, regretfully.
I even tried to use the LAG function, hoping it would match each generated date value of the RANGE_APPEND, with previous non null value of my table. But it didnt.
The Current solution
To solve my problem, which is trivial in time series analysis, I currently have a table, built in the Event Collection, using Verticas TIMESERIES with Constant Interpolation, that has one record per day, per customer, per credit account. As you can imagine, this already has 500.000.000 records just for one year of data. Its becoming unmanageable and taking most of the contracted storage in our IBC licence, that I will need for my other data models.
Please, please, consider enhancing the RANGE_APPEND with a new parameter that would allow us to chose the type of behavior for the values in the other columns of the table: assume 0 or null (current feature), assume last valid value (constant interpolation) or even linear interpolation.
Thank you.
Best regards,
Joaquim
Enter your E-mail address. We'll send you an e-mail with instructions to reset your password.
The only thing missing is the assumption of a previous value that is not an aggregation, i.e., the actual value (number, string, date) that lies in the LAG record.
I tried, but couldnt do it. The fictional date created with the RANGE_APPEND always returns a null value for the column Im trying to collect.
Cheers,
Joaquim
) = 220GB. The original table has 68 million records = 18GB.
Since this data is used during the analysis of the process, we would save a lot in storage and processing.
Cheers,
Joaquim