A programmatic guide to reconciling machine learning forecasts with deterministic corporate constraints
If you have ever attempted to apply standard open-source forecasting libraries — like Meta’s Prophet or Scikit-Learn regressors — to an Enterprise B2B sales environment, you likely hit a structural brick wall fairly quickly.
These powerful statistical aggregators are fundamentally designed for B2C traffic, warehouse inventory, and macroscopic retail movement. They rely on bottom-up historical aggregates to plot a probabilistic forecast.
But Enterprise B2B sales forecasting is fundamentally different. It isn’t just about probability — it is an inherently hierarchical, top-down, and human-biased process.
In this article, we will:
- Mathematically illustrate why standard regression libraries struggle with hierarchical constraints.
- Build a graph-theory approach using networkx and pandas to cascade targets top-down.
- Reconcile behavioral human estimation bias generated in complex corporate hierarchies.
- Diagnose pipeline health and intelligently redistribute quotas using zero-sum logic.
1. The Reality of Hierarchical Target Constraints
Standard time-series models assume growth numbers rise purely organically from the bottom up. But in enterprise environments (like SaaS, Consulting, or Manufacturing), the exact opposite occurs.
An overall macro-target (e.g., $100M) is established at the highest level — often derived from separate, high-level financial forecasting at the Board of Directors, CEO, or CRO (Chief Revenue Officer) level. That target must then be logically cascaded down the organizational Directed Acyclic Graph (DAG): Global ➔ Region ➔ VP ➔ Director ➔ Manager ➔ Individual Contributor.
Crucially, at every vertical layer of management, leaders enforce a mathematical Safety Hedge. If a Regional VP is explicitly handed a $10M target, they will not divide exactly $10M among their Directors. To protect against unforeseen underperformance, that VP might inflate the quota by 5% on its way down — officially distributing $10.5M to the layer below them.

This happens recursively at every nodal division. As a set of constraints:
- Node(Global) = Target(X)
- Sum(SubNodes) = Target(X) × Manager_Hedge_Multiplier
- ChildNode(N1) > ChildNode(N2) based on historical capacity.
To properly set enterprise sales targets, the algorithm must cleanly divide a macro target downwards while respecting the historical weight-capacity of the individual leaves (the Sales Reps) and injecting a structural hedge at each layer.
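Before writing any traversal code, the arithmetic itself is worth illustrating. Here is a minimal sketch with invented numbers, showing how a 5% hedge compounds across two management layers:

# Illustrative only: how a 5% hedge compounds across two management layers
root_target = 10_000_000
hedge = 1.05

layer_1_total = root_target * hedge      # quota distributed by the VP layer
layer_2_total = layer_1_total * hedge    # quota distributed by the Director layer

print(f"Root target:    ${root_target:,.0f}")     # $10,000,000
print(f"Layer 1 quotas: ${layer_1_total:,.0f}")   # $10,500,000
print(f"Layer 2 quotas: ${layer_2_total:,.0f}")   # $11,025,000 (~10% above root)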
2. The Structural Blindness of Traditional Forecasting Libraries
To understand these limitations mathematically, let’s simulate a hierarchical enterprise spanning three years of historical attainment data.
Generating Topographical Corporate Data
We will synthesize a minimalist organization mapping Global -> Region -> Rep performance over 36 periods.
import pandas as pd
import numpy as np

# A basic hierarchy: Global -> 2 Regions -> 4 Reps
dates = pd.date_range(start='2021-01-01', periods=36, freq='ME')
data = []
reps = {'Rep_A': 'AMER', 'Rep_B': 'AMER', 'Rep_C': 'EMEA', 'Rep_D': 'EMEA'}
base_capacities = {'Rep_A': 10000, 'Rep_B': 5000, 'Rep_C': 7500, 'Rep_D': 6000}

# Simulate historical monthly attainment with a 5% YoY trend
for date in dates:
    trend_multiplier = 1 + (date.year - 2021) * 0.05
    for rep, region in reps.items():
        attainment = base_capacities[rep] * trend_multiplier * np.random.normal(1, 0.1)
        data.append({
            'ds': date,
            'Global': 'Global_Corp',
            'Region': region,
            'Rep': rep,
            'y': round(attainment, 2)
        })

df = pd.DataFrame(data)
Now, assume our deterministic global constraint for the upcoming year is firmly set at $1.5M. Let’s try to derive rep-level targets using a standard forecasting library like Prophet.
Testing Prophet on Hierarchical Data
Prophet models univariate time series extraordinarily well. But if we try to extract child-node forecasts to logically cascade a hierarchy, the mathematics no longer reconcile.
from prophet import Prophet

# 1. Forecast the Global node independently
df_global = df.groupby('ds')['y'].sum().reset_index()
m_global = Prophet().fit(df_global)
global_forecast = m_global.predict(m_global.make_future_dataframe(periods=12, freq='ME'))
predicted_global_sum = global_forecast['yhat'].iloc[-12:].sum()

# 2. Forecast an individual Rep node independently
df_rep_a = df[df['Rep'] == 'Rep_A'][['ds', 'y']]
m_rep_a = Prophet().fit(df_rep_a)
rep_a_forecast = m_rep_a.predict(m_rep_a.make_future_dataframe(periods=12, freq='ME'))
predicted_rep_a_sum = rep_a_forecast['yhat'].iloc[-12:].sum()

print(f"Prophet Predicted Global Total: ${predicted_global_sum:,.2f}")
print(f"Prophet Predicted Rep A Total: ${predicted_rep_a_sum:,.2f}")
The Breakdown: Prophet predicts strictly independent series. Nothing forces the independent forecasts of Rep_A through Rep_D to sum to a number consistent with the $1.5M constraint. Furthermore, time-series libraries possess no mechanism to inject a 5% mathematical hedge part-way down the structure.
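To make the gap concrete, here is a small sketch that reuses the data and the global forecast above, fits an independent Prophet model per rep, and compares the bottom-up sum against the deterministic constraint:

# Fit an independent model for every rep and sum the 12-month forecasts
child_sums = {}
for rep in reps:
    df_rep = df[df['Rep'] == rep][['ds', 'y']]
    m = Prophet().fit(df_rep)
    fc = m.predict(m.make_future_dataframe(periods=12, freq='ME'))
    child_sums[rep] = fc['yhat'].iloc[-12:].sum()

bottom_up_total = sum(child_sums.values())
print(f"Sum of independent rep forecasts: ${bottom_up_total:,.2f}")
print(f"Independent global forecast:      ${predicted_global_sum:,.2f}")
print(f"Deterministic board constraint:   $1,500,000.00")
# Nothing in the library forces these three numbers to agree.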
Structural Blindness in Scikit-Learn
If we bypass time-series expectations and use a standard RandomForestRegressor with One-Hot Encoded categorical variables for Region and Rep, the model suffers from structural blindness.
from sklearn.ensemble import RandomForestRegressor

# Flattening the hierarchy via One-Hot Encoding. The datetime column is
# reduced to a numeric time index so the regressor can consume it.
features = df[['Region', 'Rep']].copy()
features['time_index'] = df['ds'].dt.year * 12 + df['ds'].dt.month
X = pd.get_dummies(features)
y = df['y']

rf = RandomForestRegressor()
rf.fit(X, y)
The algorithm sees only independent binary features (0 or 1 for each region and rep). It does not natively comprehend the topology of the business — specifically, that Rep_A is structurally nested inside AMER. If AMER drastically misses a quarter, a Random Forest has no mathematical mechanism to absorb that shock and "re-cascade" the missing quota laterally over to EMEA reps to structurally save the Global Target.
3. A Graph-Theory Approach: The NetworkX Quota Cascader
To natively integrate top-down organization hierarchies, we must physically structure our tabular data into a Directed Acyclic Graph (DAG). Instead of relying on crude aggregations, we can use networkx to calculate the edge weights between nodes based on historical capacity, then traverse the graph to recursively push a macro-target downwards.
Building the Weighted DAG
First, we construct the graph by rolling up the historical attainment for each node, which will serve as our edge weights.
import networkx as nx

# Calculate total historical capacity to weight the edges
capacity_df = df.groupby(['Global', 'Region', 'Rep'])['y'].sum().reset_index()

# Initialize a Directed Graph
G = nx.DiGraph()

for _, row in capacity_df.iterrows():
    # Global to Region Edge
    region_capacity = capacity_df[capacity_df['Region'] == row['Region']]['y'].sum()
    G.add_edge(row['Global'], row['Region'], capacity=region_capacity)
    # Region to Rep Edge
    G.add_edge(row['Region'], row['Rep'], capacity=row['y'])

print(f"Nodes in Graph: {G.nodes()}")
# Output: ['Global_Corp', 'AMER', 'Rep_A', 'Rep_B', 'EMEA', 'Rep_C', 'Rep_D']
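A quick inspection of the edge weights confirms the structure (the capacity values depend on the randomly generated history, so they will differ run to run):

for parent, child, attrs in G.edges(data=True):
    print(f"{parent} -> {child}: historical capacity {attrs['capacity']:,.2f}")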
Formulating the Recursive Quota Cascader
With our business topology correctly established, we can write a recursive function to traverse the internal nodes. At every step down, the function will:
- Identify all child nodes.
- Determine each child’s percentage share of the parent’s total historical capacity.
- Apply the Hedge Multiplier (e.g., 1.05 for +5%).
- Allocate the mathematically rigid target to the child, and call itself recursively.
def cascade_target(graph, current_node, current_target, hedge_multiplier=1.05):
    """
    Recursively cascades a target down a Directed Acyclic Graph,
    accounting for historical edge capacity and injecting a structural hedge.
    """
    allocations = {current_node: current_target}
    children = list(graph.successors(current_node))
    if not children:
        return allocations

    # Calculate total capacity of immediate children
    total_child_capacity = sum(
        graph.edges[current_node, child]['capacity'] for child in children
    )

    # Mathematical Hedge: Inflate the target before passing it down
    hedged_target_to_distribute = current_target * hedge_multiplier

    for child in children:
        child_capacity = graph.edges[current_node, child]['capacity']
        weight = child_capacity / total_child_capacity
        child_target = hedged_target_to_distribute * weight
        child_allocations = cascade_target(graph, child, child_target, hedge_multiplier)
        allocations.update(child_allocations)

    return allocations

# Execute the cascade for our rigid $1.5M constraint
allocated_quotas = cascade_target(G, 'Global_Corp', 1_500_000.0)
for node, quota in allocated_quotas.items():
    print(f"{node}: ${quota:,.2f}")
The Output Mathematics
When we execute the cascader, we witness the power of graph-traversal logic over statistical isolation:
The Global root node keeps its exact $1.5M target, while the bottom-most leaf nodes collectively carry $1,653,750 in quota ($1.5M × 1.05²). The two compounded 5% hedges structurally insulate the system against roughly 10% of aggregate underperformance, while every allocation remains proportional to each node's historical capacity.
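We can verify both invariants directly from the allocation dictionary — a small sanity check using the graph and quotas built above:

leaf_nodes = [n for n in G.nodes if G.out_degree(n) == 0]
leaf_total = sum(allocated_quotas[n] for n in leaf_nodes)

print(f"Root allocation: ${allocated_quotas['Global_Corp']:,.2f}")  # $1,500,000.00
print(f"Leaf allocations: ${leaf_total:,.2f}")                      # $1,653,750.00

# Two layers of 5% hedging compound to ~10.25% above the root target
assert abs(leaf_total - 1_500_000 * 1.05 ** 2) < 1e-6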
4. Reconciling Human Estimation Bias
Cascading the constraint is only the first step. Throughout a forecasting cycle, corporate leaders rely on manual, qualitative estimations generated by humans — often referred to as “Commits.”
Due to human psychology, independent directors often drastically under-promise (creating hidden upside) or over-promise (creating structural risk). This subjective bias completely bypasses raw Machine Learning baseline logic.
Rather than dismissing human input, we can incorporate it into our algorithms by computing behavioral bias as a mathematical index over a rolling historical horizon.
Bias Index = Historical Actuals / Historical Commit
A Bias Index of 1.25 indicates an operator who systematically under-estimates their capacity (they close 25% more than they estimate). A Bias Index of 0.70 indicates high systemic risk — a manager who chronically over-promises and under-delivers.
# 1. Synthesize a DataFrame mapping historical commitments to actual closures
bias_data = pd.DataFrame({
    'manager_id': ['Manager_A', 'Manager_A', 'Manager_B', 'Manager_B'],
    'historical_commit': [400000, 420000, 500000, 600000],
    'actual_closed': [520000, 505000, 400000, 450000]
})

# 2. Calculate the Managerial Bias Index
bias_metrics = bias_data.groupby('manager_id').sum().reset_index()
bias_metrics['bias_index'] = bias_metrics['actual_closed'] / bias_metrics['historical_commit']

# 3. Algorithmically adjust the current quarter's subjective input
current_forecasts = pd.DataFrame({
    'manager_id': ['Manager_A', 'Manager_B'],
    'subjective_q_commit': [450000, 550000]
})

reconciled = pd.merge(current_forecasts, bias_metrics[['manager_id', 'bias_index']], on='manager_id')
reconciled['algorithmically_adjusted_commit'] = reconciled['subjective_q_commit'] * reconciled['bias_index']
print(reconciled)
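With the synthetic history above, the arithmetic works out exactly: Manager_A committed $820K in total and closed $1,025K, giving a Bias Index of 1.25, so their $450K commit is adjusted up to $562,500. Manager_B committed $1,100K and closed $850K, giving a Bias Index of roughly 0.77, so their $550K commit is adjusted down to $425,000.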
By computationally tracking this Bias Index and isolating it explicitly from raw regression outputs, data scientists can give executive leaders far greater operational clarity, effectively neutralizing subjective reporting bias inside the overarching mathematical forecast.
5. Pipeline-Aware Quota Redistribution
The Quota Cascader distributes targets based on historical capacity — what each rep closed in prior quarters. But there is a second reality that emerges mid-quarter: current pipeline health.
Consider a scenario where Rep_A receives $580K in cascaded quota but has already built $2M in qualified pipeline (3.4× coverage), while Rep_D receives $336K and has only $150K in pipeline (0.45× coverage). The historical algorithm treated both fairly, but the present reality is dangerously asymmetric.
We can solve this with a post-cascade pipeline health analyzer that diagnoses coverage ratios at every node and optionally redistributes quota among ICs within the same manager’s team.
Defining Coverage Thresholds
Different regions operate under different market dynamics. A conservative AMER sales org may consider 1.5× pipeline coverage “healthy,” while an aggressive APAC team may require 3× to feel comfortable. We define this as a threshold configuration:
coverage_thresholds = {
    'AMER': {'healthy': 1.5, 'at_risk': 0.8},     # Mature market, lower bar
    'EMEA': {'healthy': 2.5, 'at_risk': 1.2},     # Mid-bar
    'APAC': {'healthy': 3.0, 'at_risk': 1.5},     # Highest bar
    '_default': {'healthy': 2.0, 'at_risk': 1.0}  # Fallback for any unspecified node
}

Individual ICs inherit the threshold of their nearest matched ancestor. An IC under APAC automatically uses APAC's threshold, even if they are not explicitly listed in the configuration.
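That ancestor lookup, which the diagnostic below relies on, can be sketched roughly as follows. This is an illustrative helper (assuming each node has at most one parent, as in our tree-shaped DAG), not the packaged implementation:

def resolve_ancestor_threshold(graph, node, coverage_thresholds):
    """Walk up the hierarchy until a node with an explicit threshold entry is found."""
    current = node
    while current is not None:
        if current in coverage_thresholds:
            return coverage_thresholds[current]
        parents = list(graph.predecessors(current))
        current = parents[0] if parents else None
    # No explicit or inherited entry found: fall back to the default band
    return coverage_thresholds['_default']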
Diagnosing Pipeline Health
With coverage thresholds set, we classify every node in the hierarchy.
Coverage Ratio = Current Pipeline / Cascaded Quota
Each node is then classified by comparing its coverage ratio against its inherited threshold: at or above the healthy bar it is considered healthy, below the at_risk bar it is flagged as at risk, and anything in between sits in a watch zone.
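A minimal classifier for those three bands might look like this — the exact status labels are an assumption made for this sketch:

def classify_risk(coverage, threshold):
    """Map a coverage ratio onto a status band using the node's threshold config."""
    if coverage >= threshold['healthy']:
        return 'Healthy'
    if coverage < threshold['at_risk']:
        return 'At Risk'
    return 'Monitor'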
Using the DAG, we can calculate these metrics recursively — a manager’s pipeline is the aggregate of their leaf-node pipelines, flowing bottom-up through the graph:
import networkx as nx

def diagnose_pipeline(graph, cascaded_quotas, coverage_thresholds):
    """
    Traverses the DAG and classifies each node by pipeline coverage.
    Manager-level pipeline is aggregated bottom-up from IC nodes.
    """
    diagnostics = []
    for node in graph.nodes:
        quota = cascaded_quotas.get(node, 0)

        # Leaf nodes carry their own pipeline; internal nodes aggregate children
        if graph.out_degree(node) == 0:
            pipeline = graph.nodes[node].get('Current_Pipeline', 0)
        else:
            leaves = [n for n in nx.descendants(graph, node) if graph.out_degree(n) == 0]
            pipeline = sum(graph.nodes[l].get('Current_Pipeline', 0) for l in leaves)

        coverage = pipeline / quota if quota > 0 else 0.0

        # Helper functions sketched earlier — see the full implementation on GitHub
        threshold = resolve_ancestor_threshold(graph, node, coverage_thresholds)
        risk = classify_risk(coverage, threshold)

        diagnostics.append({
            'Node': node, 'Quota': quota, 'Pipeline': pipeline,
            'Coverage': round(coverage, 2), 'Status': risk
        })
    return pd.DataFrame(diagnostics)
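To run the diagnostic, each leaf (IC) node needs a Current_Pipeline attribute. A usage sketch with invented mid-quarter pipeline figures:

# Attach illustrative pipeline values to the IC (leaf) nodes
sample_pipeline = {'Rep_A': 2_000_000, 'Rep_B': 300_000, 'Rep_C': 900_000, 'Rep_D': 150_000}
nx.set_node_attributes(G, sample_pipeline, name='Current_Pipeline')

diagnostics_df = diagnose_pipeline(G, allocated_quotas, coverage_thresholds)
print(diagnostics_df.sort_values('Coverage'))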
Zero-Sum Redistribution
When the diagnostic reveals stark coverage imbalances within the same manager’s team, we can redistribute quota from healthy “donors” to at-risk “receivers” — without touching the manager’s total.
The algorithm follows three strict design constraints:
- Only IC-level quotas are adjusted. Manager and above quotas are never modified.
- Redistribution is zero-sum within each manager’s team: Σ(donor deductions) = Σ(receiver increases).
- Safety rails: No individual IC’s quota can change by more than ±20% (configurable via max_adjustment_pct).
def redistribute_within_manager(manager, ic_data, max_pct=0.20):
    """
    Zero-sum redistribution: donors (healthy coverage) give to
    receivers (at-risk coverage) within the same manager's team.
    """
    donors = {ic: d for ic, d in ic_data.items() if d['coverage'] >= d['threshold']['healthy']}
    receivers = {ic: d for ic, d in ic_data.items() if d['coverage'] < d['threshold']['at_risk']}
    if not donors or not receivers:
        return ic_data  # Nothing to redistribute

    # Calculate surplus from donors (capped at max_pct of their quota)
    total_surplus = 0.0
    for d in donors.values():
        excess = (d['coverage'] - d['threshold']['healthy']) / d['coverage']
        d['contribution'] = min(d['quota'] * max_pct, d['quota'] * max(0.0, excess))
        total_surplus += d['contribution']

    # Distribute to receivers proportionally by need (also capped at max_pct)
    total_need = sum(
        (d['threshold']['at_risk'] - d['coverage']) / d['threshold']['at_risk']
        for d in receivers.values()
    )
    total_given = 0.0
    for d in receivers.values():
        weight = ((d['threshold']['at_risk'] - d['coverage']) / d['threshold']['at_risk']) / total_need
        increase = min(total_surplus * weight, d['quota'] * max_pct)
        d['adjusted_quota'] = d['quota'] + increase
        total_given += increase

    # Keep the operation strictly zero-sum: scale donor deductions so that
    # Σ(donor deductions) == Σ(receiver increases), preserving the manager total
    scale = total_given / total_surplus if total_surplus > 0 else 0.0
    for d in donors.values():
        d['adjusted_quota'] = d['quota'] - d['contribution'] * scale

    return ic_data
Rep_A has a comfortable 2.76× pipeline coverage — they can absorb a slight quota reduction. Rep_B is in critical territory at 0.69× — the additional $24K gives them a materially better chance of hitting target. And crucially, the AMER manager's total hasn't changed by a single dollar.
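To see the mechanics end to end, here is a usage sketch of the function above. The team name, quotas, and coverage figures are invented for illustration and will not reproduce the exact $24K adjustment described earlier:

# Illustrative AMER team: one healthy donor, one at-risk receiver
amer_threshold = coverage_thresholds['AMER']
ic_data = {
    'Rep_A': {'quota': 580_000, 'coverage': 2.76, 'threshold': amer_threshold},
    'Rep_B': {'quota': 300_000, 'coverage': 0.69, 'threshold': amer_threshold},
}

adjusted = redistribute_within_manager('AMER_Manager', ic_data)
for ic, d in adjusted.items():
    print(ic, round(d.get('adjusted_quota', d['quota']), 2))
# With these numbers: Rep_A drops to 520,000 and Rep_B rises to 360,000 —
# a $60K zero-sum shift, capped by Rep_B's ±20% safety rail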
Conclusion
In this exploration, we navigated around standard aggregated regressions and instead built a multi-layered Enterprise framework:
- Graph Theory Models Reality: By mapping tabular hierarchies into native Directed Acyclic Graphs (networkx), we can mathematically push deterministic constraints downward, respecting both historical capacity weights and intentional safety layers.
- Algorithms Must Accommodate Behavioral Bias: Data science in structurally constrained environments isn’t purely statistical. Bridging the gap requires logically indexing human subjectivity inside your baseline predictions.
- Pipeline Health Demands Post-Allocation Reconciliation: Static quota assignments based on historical capacity are a starting point, not an endpoint. Mid-quarter pipeline diagnostics with zero-sum redistribution transform a rigid annual plan into a living, responsive system.
- Safety-First Algorithmic Design: Constraints like max adjustment caps (±20%), locked/CRO-mandated nodes, and zero-sum invariants prevent algorithms from creating operational chaos. Enterprise systems require guardrails, not just optimization functions.
Traditional libraries like Prophet remain excellent for macroscopic time-series forecasting, but rigid hierarchical constraints require heavier, graph-integrated logic.
The complete underlying engine discussed in this article — including the NetworkX SalesHierarchy, recursive QuotaCascader, CommitReconciler, and PipelineAdjuster objects — is available open-source as the b2b-revenue-forecasting package on GitHub and PyPI.
References
- Taylor, S. J., & Letham, B. (2018). Forecasting at Scale (Prophet). The American Statistician, 72(1), 37–45. https://peerj.com/preprints/3190/
- Herzen, J., et al. (2022). Darts: User-Friendly Modern Machine Learning for Time Series. Journal of Machine Learning Research. https://unit8co.github.io/darts/
- Hagberg, A. A., Schult, D. A., & Swart, P. J. (2008). Exploring Network Structure, Dynamics, and Function using NetworkX. Proceedings of the 7th Python in Science Conference. https://networkx.org/
- Athanasopoulos, G., et al. (2009). Hierarchical forecasts for Australian domestic tourism. International Journal of Forecasting, 25(1), 146–166.