Using linear Regression to forecast


#1

Hello,

We have data like Document creation date(Month) and volume of documents in each month.

Month | Volume

June 100
July 150
August 70

I am using linear regression function to anticipate volumes per month and it seems to be giving right results (slope/intercept). This gives relevant data only for available Document creation date(Month) in the data i.e up to August.
Now I would like to forecast for next 3 months which is at present not showing. Can someone pls help me if we can forecast upcoming months volume using celonis standard functions and without R/Python support?


#3

Dear Amogh,

you need to create a series of values including the next three upcoming months first.
You can do this by using the RANGE(value1, interval, count) - operator.

Range-Operator

Use this series as input column for your trained linear regression model:

linear%20regression

Best regards,
Kerstin


#4

Thank you Kerstin, your input was really helpful.
Is there any chance I can plot my X axis in conjunction with ROUND_MONTH(Date) in my data and forecast date range together?
something like this…
image


#5

Any solution or workaround? :worried:


#8

Dear Sapre,

unfortunately this is not possible within the current Celonis version.
To display both, the old values and the forecasted values in one chart, it would be required to append the range values to the dimension column on which you forecasted the values.

Best regards,
Kerstin


#9

Hello, Linear Regression doesn’t work for me. If I use this exact same formula, it returns me chart with same values in each month and no predicted months. Can you please explain this?

Thanks and Best Regards,
Vaclav


#14

Hello Vaclav,

is this still current? If yes, could you check whether the function without visualisation does work for you? Do you have differing values as function input then the 1.285.129 here?

Best,
Henry


#15

Hello Henry,

yes my input is of different values then 1,285,129. It draws me line (regression) of months that are in the past, but it doesn’t draw future months. And When I use formula that has been discussed in this topic, it changes the values only to one in all months.

Best Regards,
Vaclav


#16

Hello Vaclav,

I tried it out and it worked just fine on demo data. I think it would make sense to raise a ticket with the servicedesk so they can assess if this is a bug.

servicedesk@celonis.com

Best,
Henry


#17

Hello,

can you please show me how it looks in your celonis? PQL Code and Graphical view?

Best Regards,
Vaclav


#18

Hi Vaclav,

Dimension:

ROUND_MONTH(RANGE(ROUND_MONTH(ADD_MONTHS(TODAY(),1)),‘1M’,5))

KPI:

LINEAR_REGRESSION(
TRAIN_LM(
INPUT(ROUND_MONTH("_CEL_P2P_ACTIVITIES".“EVENTTIME”)),
OUTPUT(COUNT_TABLE(“EKPO”))
),
PREDICT(ROUND_MONTH(RANGE(ROUND_MONTH(ADD_MONTHS(TODAY(),1)),‘1M’,5)))
)

Graphical Output:

I know this looks weird but my last actual dates are in 2010 and the trend is negative thus a forecast of negative cases is normal.

Hope this helps!

Best,
Henry


#19

Thanks Henry,

it works now. I have even visualised past and future which I was aiming for. Looks better now.

.

Do you know if there is or will be possible to somehow check Linear regression if it is statistically significant or usable? I mean to see Residuals, Significance, correlation. I mean like in R for example. To verify if result is ok.

Best Regards,
Vaclav.


#20

Hello Vaclav,

currently this is not possible but our development team is working on it.

Best,
Henry


#21

Cool, how did you do the combination of past and future data?


#22

Hi Johannes,

sorry for my late answer, I do not visit here often. Try this:

Dimension:
ROUND_MONTH(RANGE(ROUND_MONTH(ADD_MONTHS(TODAY(),-10)),‘1M’, 25))

KPI:
LINEAR_REGRESSION(
TRAIN_LM(
INPUT(ROUND_MONTH(PU_MIN(“EKPO”, “_CEL_P2P_ACTIVITIES”.“EVENTTIME”, “_CEL_P2P_ACTIVITIES”.“ACTIVITY_EN” = ‘Create Purchase Order Item’))
),
OUTPUT(COUNT_TABLE(“EKPO”))
),
PREDICT(ROUND_MONTH(RANGE(ROUND_MONTH(ADD_MONTHS(TODAY(),-10)),‘1M’,25)))
)

Basically -10 represents how far (in months) I want to go in history, so now it would be from April 2018. and 25 represents how far from april I want to go into future. So 25 months from April would be 2020 June or July. Data fields are from PtP, so use some others from you Data model.

Best Regards,
Vaclav Juren
Deutsche Telekom