Indicating significance levels in the regression output

Scott Rick

Join Date: May 2021
Posts: 242

15 Jul 2024, 04:07

Hi. In the code below, I run event-study regressions and export only the coefficients and standard errors to an Excel sheet. How can I modify the code to include p-values as well? I'd like the output to show which coefficients are significant and which are not, either through the usual * notation or through the actual p-value.

Any help will be much appreciated.

Code:

*PART 1

capture program drop runEventStudies

program define runEventStudies
    
    args data yList name
    

    use "${outdir}\\`data'.dta", clear
    

    * Run regressions - include test rates
    foreach y of local yList {
        
        disp "`y'"
        
        forvalues i = 1/`N' {
            disp "`i'"
            
            * Adding linear, group-specific trend            
            qui areg `y' weeknum treated_trend post treated_post tested i.weeknum ///
                if eventBlock == `i',absorb(state)
            local b_`y'_`i'_b_trend = _b[treated_post]
            local se_`y'_`i'_b_trend = _se[treated_post]    
            
            
            }
        }
    
            

*** Step 2: Construct a dataset saving all coefficients and SEs

*keep Treated_State adopt_week eventBlock cem_varlist 
    qui duplicates drop

    foreach suff in b b_trend {
        
        foreach y of local yList {
                
            qui gen b_`y'_`suff' = ""
            qui gen se_`y'_`suff' = ""
        
            forvalues i = 1/`N' {
                
                qui replace b_`y'_`suff' = "`b_`y'_`i'_`suff''" if eventBlock == `i'
                qui replace se_`y'_`suff' = "`se_`y'_`i'_`suff''" if eventBlock == `i'
                
                }
            
            qui destring b_`y'_`suff', replace
            qui destring se_`y'_`suff', replace
            
            }
        }
 
    drop eventBlock
    
        * Save the data
        foreach x in Treated_State {
        encode `x', gen(temp)
        drop `x'
        rename temp `x'
        }
 
     * save
    order state weeknum cem_varlist 
    sort state weeknum
    save "${coeffdir}\coefficients_`name'.dta", replace

    

end


*PART 2. Run the event studies               

* Acquired    
runEventStudies  "1matched_cohort"                                        /// Data
                `"confirmed_1 recovered_1 deceased_1"'                     /// yList
                 "one"                                                    /// name
                

runEventStudies  "2matched_cohort"                                        /// Data
                `"confirmed_2 recovered_2 deceased_2"'                     /// yList
                 "two"                                                    /// name




* PART 3. Calculate averages


local oneVars     "confirmed_1 recovered_1 deceased_1"
local twoVars     "confirmed_2 recovered_2 deceased_2"
local threeVars "confirmed_3 recovered_3 deceased_3"
local fourVars     "confirmed_4 recovered_4 deceased_4"

local sets `" "" "_trend" "'


foreach set of local sets {
    
    
    * open file, write header
    file open EScoeff using "${paperdir}\Tables\EScoeff`set'.csv", write replace


    file write EScoeff "Variable,All states,Early adopters,Late adopters" _n
                
        foreach sample in one two three four {
        
        use "${coeffdir}\coefficients_`sample'.dta", clear

        * Average effect of all states
        foreach var of local `sample'Vars {
            
            * ALL
            qui egen wavg_`var' = wtmean(b_`var'_b`set') ///
                if treated == 1, weight(1/(se_`var'_b`set'^2))
            qui egen wvar_`var' = mean(1/(se_`var'_b`set'^2)) /// SE command
                if treated == 1 
            qui replace wvar_`var' = sqrt(1/wvar_`var') // follows from 206

            
            * record values
            sum wavg_`var'
            local avgAllStates : display %4.2f r(mean)
            drop wavg_`var'
            sum wvar_`var'
            local sdAllStates : display %4.2f r(mean)
            drop wvar_`var'        
            
            
/*
            file write testtable "`var',`avgAllStates'" _n  
            file write testtable `",="(`sdAllStates')""' _n  
            */

            file write EScoeff "`var',`avgAllStates',`avgEarlyStates',`avgLateStates'" _n  
            file write EScoeff `",="(`sdAllStates')",="(`sdEarlyStates')",="(`sdLateStates')""' _n
            
        }        
        
        }

    file close EScoeff
    
}


Maarten Buis

Join Date: Mar 2014

Posts: 3445
#2

15 Jul 2024, 04:10

So you want to recover the p-value from returned results. See this Stata tip: https://www.stata-journal.com/articl...article=st0137

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
George Ford

Join Date: Aug 2014

Posts: 3135
#3

15 Jul 2024, 07:13

Very useful tip, Maarten. A keeper.

areg returns r(table).

With the regressors in this order, treated_post is in column 4.

Code:

local b_`y'_`i'_b_trend = r(table)[1,4]
local se_`y'_`i'_b_trend = r(table)[2,4]
local t_`y'_`i'_b_trend = r(table)[3,4]
local p_`y'_`i'_b_trend = r(table)[4,4]

I'd put treated_post as the first regressor and index r(table) as [*,1].

This has no real advantage over computing the t/p using Maarten's approach (and indexing by variable name is perhaps safer).
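
A minimal sketch of both routes mentioned in this thread, run on Stata's bundled auto dataset rather than the poster's data (price, weight, mpg, foreign and rep78 are stand-ins, not variables from the original code). It computes the p-value by name from _b[] and _se[], reads the same value from r(table) using row and column names instead of fixed positions, and builds a star string. The 10/5/1 percent cutoffs are an assumption; use whatever convention the paper follows.

```stata
* Stand-in data and regression (not the poster's model)
sysuse auto, clear
quietly areg price weight mpg foreign, absorb(rep78)

* Copy r(table) right away, before another command overwrites r()
matrix T = r(table)

* Route 1: index by variable name and compute t and p by hand
local b  = _b[foreign]
local se = _se[foreign]
local t  = `b'/`se'
local p  = 2*ttail(e(df_r), abs(`t'))

* Route 2: look the p-value up in r(table) by row and column name
local p2 = T[rownumb(T, "pvalue"), colnumb(T, "foreign")]

* Star notation with assumed cutoffs of 0.10, 0.05 and 0.01
local stars ""
if `p' < 0.10 local stars "*"
if `p' < 0.05 local stars "**"
if `p' < 0.01 local stars "***"

display "b = " %9.3f `b' "`stars'" "   se = " %9.3f `se'
display "p (by hand) = " %6.4f `p' "   p (r(table)) = " %6.4f `p2'
```

Inside runEventStudies the same pattern would store the result in a local such as p_`y'_`i'_b_trend next to the existing b_ and se_ locals, and the Step 2 loop would then need a matching p_ variable.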
Scott Rick

Join Date: May 2021

Posts: 242
#4

15 Jul 2024, 09:14

Maarten Buis and George Ford: Thank you. This really helps.

However, in my study, I aggregate the estimates from individual event-study regressions by calculating the average effect of all treatments (or subgroups of treated states with similar characteristics). This is similar to a stacked regression, while leveraging custom weights to calculate more efficient averages. From the article and George's code, I can calculate the p-values for the individual event-study regressions, but I'm not sure how to do that for the stacked regression. Is there something I'm missing here? I'd appreciate any suggestions on this.

Last edited by Scott Rick; 15 Jul 2024, 09:16.
Andrew Musau

Join Date: Oct 2014

Posts: 10180
#5

15 Jul 2024, 10:05

You are asking a substantive question about how to obtain p-values from combined models, mixed with a question on how to output estimation results. Start a new thread with the specific question and provide a reproducible example showing at least two results that you need to combine to obtain a p-value.

Scott Rick

Join Date: May 2021
Posts: 242

19 Jul 2024, 03:08

Thank you, Andrew Musau. I will do that.

Maarten Buis and George Ford : In my code above, I aggregate the estimates from individual event-study regressions by calculating the average effect of all treatments. On doing so, I get the variables “b_confirmed_1_b” and “se_confirmed_1_b”. Since I am interested in the p-values for the average treatment effects and not the individual event-study regressions, I applied the method highlighted by Maarten on the aggregated coefficients and SE values (highlighted in the code below in red). However, I get the error "invalid 'local'" and am unable to figure out what I am doing wrong here. I'd greatly appreciate any advice.

Code:

  
* Calculate averages

local oneVars     "confirmed_1 recovered_1 deceased_1"
local twoVars     "confirmed_2 recovered_2 deceased_2"
local threeVars   "confirmed_3 recovered_3 deceased_3"
local fourVars     "confirmed_4 recovered_4 deceased_4"  

local sets `" "" "_trend" "'  

foreach set of local sets {              

* open file, write header
    file open EScoeff using "${paperdir}\Tables\EScoeff`set'.csv", write replace


    file write EScoeff "Variable,All states,Early adopters,Late adopters" _n
                
        foreach sample in one two three four {
        
        use "${coeffdir}\coefficients_`sample'.dta", clear

        * Average effect of all states
        foreach var of local `sample'Vars {
            
            * ALL
            qui egen wavg_`var' = wtmean(b_`var'_b`set') ///
                if treated == 1, weight(1/(se_`var'_b`set'^2))
            qui egen wvar_`var' = mean(1/(se_`var'_b`set'^2)) /// SE command
                if treated == 1 
            qui replace wvar_`var' = sqrt(1/wvar_`var') ///follows from 206
local t = wavg_`var'/wvar_`var'              
di 2*ttail(e(df_r),abs(`t´))              
df = st_numscalar("e(df_r)")            
 t = wavg_`var'/wvar_`var'              
2*ttail(df, abs(t))                          

* record values
            sum wavg_`var'
            local avgAllStates : display %4.2f r(mean)
            drop wavg_`var'
            sum wvar_`var'
            local sdAllStates : display %4.2f r(mean)
            drop wvar_`var'        

          file write EScoeff "`var',`avgAllStates',`avgEarlyStates',`avgLateStates'" _n  
            file write EScoeff `",="(`sdAllStates')",="(`sdEarlyStates')",="(`sdLateStates')""' _n
            
        }        
        
        }

    file close EScoeff
    
}

Last edited by Scott Rick; 19 Jul 2024, 03:15.
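
A hedged reading of the error in the post above, followed by a small sketch. The replace line ends in ///follows from 206, and /// is a line-continuation marker, so Stata appends the next line (local t = ...) to the replace command. That is a likely source of "invalid 'local'"; a trailing comment that should not continue the line uses // instead. A few further problems would probably surface next. The macro reference in the ttail() line is closed with an acute accent rather than an apostrophe. The lines beginning df = and t = are Mata syntax from the Stata tip and do not run as Stata commands. A local assigned from a variable picks up the value in the first observation only. Finally, e(df_r) after loading a dataset of saved coefficients holds whatever the last regression left behind, or is missing, and is not a degrees-of-freedom value for the average. The sketch below uses made-up numbers and assumes independent estimates and a normal approximation. Its standard error follows the textbook inverse-variance formula, sqrt(1/sum of weights); the posted code uses mean() rather than a sum at that step, so it is worth checking which one is intended.

```stata
* Toy data: three hypothetical event-study estimates and their standard errors
clear
input double(b se)
0.50 0.20
0.30 0.10
0.40 0.25
end

* Inverse-variance weights
generate double w = 1/se^2

* Weighted mean of the coefficients
quietly summarize b [aweight = w]
scalar avg = r(mean)

* Standard error of the weighted mean: sqrt(1 / sum of weights)
quietly summarize w
scalar se_avg = sqrt(1/r(sum))

* Two-sided p-value from a normal approximation (an assumption, see above)
scalar z = avg/se_avg
scalar p = 2*normal(-abs(z))

display "average = " %6.3f avg "   se = " %6.3f se_avg "   p = " %6.4f p
```

Working with scalars from summarize avoids depending on the first observation, and the p-value can be written to the CSV with file write in the same way as the averages. Whether a normal or a t reference distribution is appropriate for the combined estimate is the substantive question Andrew suggests raising in a new thread.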
