TopDown Reconciliation Error Without All Forecasts #290

Open
breadwall opened this issue Oct 3, 2024 · 2 comments

breadwall commented Oct 3, 2024

What happened + What you expected to happen

The TopDown reconciliation method with average_proportions appears to require forecasts for every hierarchy level, even though top-down should only need the forecasts at the top level and the historical values for all combinations. I tried filling in all the missing levels in Y_hat_df with dummy values such as 1, but that changes the reconciled top-down forecasts.

Am I missing something?

Versions / Dependencies

hierarchical_forecast ~ 0.4.2

Reproduction script

import numpy as np
import pandas as pd

from statsforecast.core import StatsForecast
from statsforecast.models import AutoETS

from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.evaluation import HierarchicalEvaluation
from hierarchicalforecast.methods import TopDown

from hierarchicalforecast import utils

# Parameters for dataset creation
n_rows = 1000
date_range = pd.date_range(start='2020-01-01', periods=n_rows, freq='MS')
group_col_one_values = ['A', 'B', 'C']
group_col_two_values = ['X', 'Y', 'Z']

# Create the dataset
data = pd.DataFrame({
    'group_col_one': np.random.choice(group_col_one_values, size=n_rows),
    'group_col_two': np.random.choice(group_col_two_values, size=n_rows),
    'ds': date_range,
    'y': np.random.randint(1, 100, size=n_rows)
})

# Create Top Level to Generate Forecasts
top_data = data.groupby(by=['group_col_one', 'ds'])['y'].sum().reset_index()
Y_top_df, S_top_df, tags_top = utils.aggregate(top_data, [['group_col_one']])

# Create Historical Values of Both 'Top' & 'Bottom'
Y_hist_df, S_hist_df, tags_hist = utils.aggregate(data, [['group_col_one'], ['group_col_one', 'group_col_two']])

# Produce Top Level Forecasts to use for Disaggregation
fcst = StatsForecast(models=[AutoETS(season_length=12)],
                     freq='MS')
Y_hat_df = fcst.forecast(h=12, df=Y_top_df)
reconcilers = [TopDown(method='proportion_averages')]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(Y_hat_df, S_hist_df, tags_hist, Y_hist_df)

Issue Severity

None

breadwall added the bug label Oct 3, 2024
@elephaint
Contributor

Thanks for raising the issue; I can reproduce it.

I'll have to think about this a bit. I agree with your point that forecasts should only be required for the top level, but the implemented checks prevent that. Before simply bypassing these checks for this case, I need to test a bit further to make sure nothing else breaks.
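Until the checks are relaxed, the top-down split can also be computed by hand. The following is a minimal pandas sketch, not the library's implementation. It assumes long-format frames with hypothetical column names (`unique_id`, `ds`, `y` for bottom-level history; `ds`, `y_hat` for top-level forecasts), a single top-level series, and the proportions-of-historical-averages variant. The helper name `manual_top_down` and the sample data are made up for illustration.

```python
import pandas as pd

def manual_top_down(hist: pd.DataFrame, top_fcst: pd.DataFrame) -> pd.DataFrame:
    """Split top-level forecasts across bottom series by historical share."""
    # Share of each bottom series in the historical total. When all series
    # cover the same dates, this equals the ratio of historical averages.
    props = (
        (hist.groupby("unique_id")["y"].sum() / hist["y"].sum())
        .rename("prop")
        .reset_index()
    )
    # Pair every bottom series with every forecast date, then scale.
    out = props.merge(top_fcst, how="cross")
    out["y_rec"] = out["prop"] * out["y_hat"]
    return out[["unique_id", "ds", "y_rec"]]

# Hypothetical bottom-level history for two series over two months.
hist = pd.DataFrame({
    "unique_id": ["A/X", "A/X", "A/Y", "A/Y"],
    "ds": pd.to_datetime(["2024-01-01", "2024-02-01"] * 2),
    "y": [1.0, 3.0, 3.0, 9.0],
})
# Hypothetical top-level forecasts for the next two months.
top_fcst = pd.DataFrame({
    "ds": pd.to_datetime(["2024-03-01", "2024-04-01"]),
    "y_hat": [100.0, 200.0],
})

rec = manual_top_down(hist, top_fcst)
print(rec)
```

Only the top-level forecasts and the bottom-level history enter the calculation, so the bottom-level values add back up to the top-level forecast on every date.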

elephaint self-assigned this Oct 7, 2024
@christophertitchen
Contributor

It is an interesting quirk of the design. I noticed it too but did not think much of it because in my current use cases, which do not use this library yet, I generate forecasts for all levels. Of course, I can appreciate that in production, if you and your practitioner(s) decide on a particular single-level approach, there is no need to exhaustively forecast at every level or mess about with reshaping Y_hat_df yourself, so thanks Olivier for looking into it! 👍

The forecasts for average_proportions should be the same regardless of the forecasts of the lower levels though, so it is worrying if filling them with $1$ gives strange results. Actually, come to think of it, we do not even need the in-sample values of the "middle" levels for this scenario, just the top and bottom levels.
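The independence from lower-level forecasts can be checked numerically. The NumPy sketch below writes top-down reconciliation in the usual matrix form $S G \hat{y}$ on a small made-up hierarchy (total, two middle nodes, three bottom series). It assumes that `average_proportions` and `proportion_averages` follow the textbook definitions (mean of per-period shares, and share of historical means); the hierarchy, the numbers, and the `reconcile` helper are all hypothetical and none of this is the library's code.

```python
import numpy as np

# Hypothetical hierarchy: total -> {A, B} -> {A1, A2, B1}.
# Rows of S: total, A, B, A1, A2, B1. Columns: bottom series A1, A2, B1.
S = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
], dtype=float)

# In-sample bottom-level history (rows = periods, columns = A1, A2, B1).
y_bottom = np.array([[10.0, 30.0, 60.0],
                     [40.0, 40.0, 120.0]])
y_total = y_bottom.sum(axis=1)  # top-level history

# Assumed definition of average_proportions: mean of per-period shares.
p_avg_prop = (y_bottom / y_total[:, None]).mean(axis=0)
# Assumed definition of proportion_averages: share of historical means.
p_prop_avg = y_bottom.mean(axis=0) / y_total.mean()

def reconcile(y_hat: np.ndarray, p: np.ndarray) -> np.ndarray:
    """Top-down reconciliation S @ G @ y_hat, with G built from proportions p."""
    G = np.zeros((S.shape[1], S.shape[0]))
    G[:, 0] = p  # only the top-level forecast gets a nonzero weight
    return S @ G @ y_hat

# Same top-level forecast (300), different fillers for the other levels.
y_hat_dummy = np.array([300.0, 1.0, 1.0, 1.0, 1.0, 1.0])
y_hat_zero = np.array([300.0, 0.0, 0.0, 0.0, 0.0, 0.0])

rec_dummy = reconcile(y_hat_dummy, p_avg_prop)
rec_zero = reconcile(y_hat_zero, p_avg_prop)
print(rec_dummy)
print(np.allclose(rec_dummy, rec_zero))
```

Every column of `G` except the first is zero, so any filler placed in the lower-level forecasts is multiplied by zero. The proportions are built from the bottom-level history and its sum only, so the middle-level history never enters.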
