Dear Statalist members,
I’m working with dyadic panel data (buyer–supplier pairs) and studying how buyer growth (COGS) relates to total emissions at the dyad level (buyer + supplier Scope 1+2). I’m trying to decide the most appropriate modeling strategy among three options, and would appreciate your input on the theoretical and statistical validity of each.
It is important that the elasticities can be interpreted in absolute terms. I am interested in how buyer growth affects the supplier with which it has a relationship.
My dependent variable is dyadic emissions, and I have three modeling alternatives:
1. Unscaled model (baseline)
Dependent: gen ln_ted = log(scope1_buyer + scope2_buyer + scope1_supplier + scope2_supplier)
Independent: ln_cogs_buyer (not scaled), with ln_cogs_supplier as a control.
Used when interpreting the influence of full buyer activity on dyadic emissions.
Full model in Stata:
reghdfe ln_ted ln_cogsb cap_intensityb rd_intensityb leverageb_w ihs_ebitb env_scm_numb i.limit_net cap_intensitys rd_intensitys leverages_w ihs_ebits ln_cogss ln_rel_size_w ln_size_diff i.same_cdp sample_selection supplier_selection_bias , absorb(pair_id year) vce(cluster pair_id isins isinb)
2. Scaled dependent only
Dependent: ln_sted = log((scope1_buyer + scope2_buyer) * cost_share + (scope1_supplier + scope2_supplier) * rev_share)
Independent: ln_cogs_buyer (not scaled), ln_cogs_supplier (unscaled firm-level COGS)
Captures relationship-weighted emissions, but not scaled growth.
3. Scaled dependent + scaled independent (fully dyadic)
Dependent: ln_sted (same as above)
Independent: ln_cogsb_scaled = log(COGSb * cost_share) and ln_cogss_scaled = log(COGSs * rev_share)
Both emissions and activity are scaled to reflect the scope of the relationship.
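To make the link between alternatives 2 and 3 explicit, the log transformation can be written out. This is only a sketch of the algebra, not an estimation result. The symbols \beta, \gamma, \delta, the control vector X, and the subscripts i (pair) and t (year) are my own notation, and I assume the same pair and year fixed effects (\alpha_i, \tau_t) as in the baseline specification.

```latex
\begin{align*}
\ln(\mathrm{COGS}_{b,it} \cdot \mathrm{cost\_share}_{it})
  &= \ln \mathrm{COGS}_{b,it} + \ln \mathrm{cost\_share}_{it}, \\
\ln(\mathrm{COGS}_{s,it} \cdot \mathrm{rev\_share}_{it})
  &= \ln \mathrm{COGS}_{s,it} + \ln \mathrm{rev\_share}_{it}.
\end{align*}

% Alternative 3, written in terms of the unscaled regressors:
\begin{align*}
\mathrm{ln\_sted}_{it}
  &= \beta \ln \mathrm{COGS}_{b,it} + \beta \ln \mathrm{cost\_share}_{it} \\
  &\quad + \gamma \ln \mathrm{COGS}_{s,it} + \gamma \ln \mathrm{rev\_share}_{it}
   + X_{it}'\delta + \alpha_i + \tau_t + \varepsilon_{it}.
\end{align*}
```

Read this way, alternative 3 is alternative 2 with the two log shares added as regressors, each restricted to have the same coefficient as the corresponding log COGS term. If cost_share and rev_share were constant over time within a pair, the pair fixed effect would absorb them and the two alternatives would give the same COGS coefficients; any difference would therefore come from within-pair variation in the shares.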
My questions:
1. Which of these three is most appropriate to study the effect of buyer growth?
2. Does using total COGS while scaling emissions (model 2) introduce interpretive inconsistencies?
3. Should scaling apply only to the dependent variable, or must it be symmetric?
I’d be happy to share estimation outputs if helpful.
Many thanks in advance!
Best,
Iryna