Alternatives for -fweight- in -xtreg, re-

Lennart Osses

Join Date: Oct 2019
Posts: 11


19 Jul 2023, 02:40

Hi everyone,

I want to run a bootstrapping procedure for a control function approach to check the robustness of my results. For this, I first regress my independent variable on two instruments. I then want to predict the residuals from this first stage and use them in my second-stage regression. Full code below.

My issue: I use random effects for the first-stage regression because one of the instruments is time-invariant. However, with -xtreg, re- I cannot use -fweight- as intended:

weight not allowed with between-effects and random-effects models
r(101)

I have tried using -xtreg, mle- as per this thread; however, I can then no longer predict the residuals, which is essential for what I want to achieve:

option res not allowed
r(198)

Are there any alternative solution approaches that you could recommend?

It has been quite a while since I posted, hope I could clarify my issue. If not, please let me know.

Thanks - looking forward to your replies.

Code:

save DataBS_PAT.dta, replace
            
            set seed 1
            capture postclose bs
                postfile bs repl ExtFounderCEO res using bs, replace
                forvalues repl = 1/1000 {
                    noi dis "repl: " `repl'
                    clear
                    use "DataBS_PAT.dta"
                    sort Gvkey Year
                    xtset Gvkey Year
                    gen bsweight = .
                    bsample, cluster(Gvkey) weight(bsweight)
                    xtreg ExtFounderCEO ShareExtFounderCEOIndPeers2 KM_EmployerBusinessNewness IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year [fweight = bsweight], re
                    capture drop res
                    predict res, res
                    xtpoisson f.pat_filings ExtFounderCEO IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year res, fe vce(robust)
                    post bs (`repl') (_b[ExtFounderCEO]) (_b[res])
                }
                postclose bs

Tags: bootstrapping, fweight, random effects, xtreg

Andrew Musau

Join Date: Oct 2014
Posts: 10281

19 Jul 2023, 03:36

Expanding the number of observations with -expand- replicates frequency weights, so you can expand the data and then run the command without weights.

Code:

sysuse auto, clear
drop if missing(rep78)
*FREQ. WEIGHTS
regress mpg weight [fweight= rep78]
*EXPAND OBS.
expand rep78
regress mpg weight

Res.:

Code:

. sysuse auto, clear
(1978 Automobile Data)

. drop if missing(rep78)
(5 observations deleted)

. regress mpg weight [fweight= rep78]

      Source |       SS           df       MS      Number of obs   =       235
-------------+----------------------------------   F(1, 233)       =    397.33
       Model |  5984.38543         1  5984.38543   Prob > F        =    0.0000
    Residual |  3509.34223       233  15.0615546   R-squared       =    0.6304
-------------+----------------------------------   Adj R-squared   =    0.6288
       Total |  9493.72766       234  40.5714857   Root MSE        =    3.8809

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |  -.0062661   .0003144   -19.93   0.000    -.0068854   -.0056468
       _cons |   40.39523   .9585906    42.14   0.000     38.50662    42.28384
------------------------------------------------------------------------------

. expand rep78
(166 observations created)

. regress mpg weight

      Source |       SS           df       MS      Number of obs   =       235
-------------+----------------------------------   F(1, 233)       =    397.33
       Model |  5984.38543         1  5984.38543   Prob > F        =    0.0000
    Residual |  3509.34223       233  15.0615546   R-squared       =    0.6304
-------------+----------------------------------   Adj R-squared   =    0.6288
       Total |  9493.72766       234  40.5714857   Root MSE        =    3.8809

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |  -.0062661   .0003144   -19.93   0.000    -.0068854   -.0056468
       _cons |   40.39523   .9585906    42.14   0.000     38.50662    42.28384
------------------------------------------------------------------------------

.

Last edited by Andrew Musau; 19 Jul 2023, 03:43.


Lennart Osses

Join Date: Oct 2019

Posts: 11
#3

19 Jul 2023, 05:57

Hey Andrew,

thanks for the fast reply. However, I am not able to follow what your advice would be - is it to expand my sample?

As I am working with a large panel data set on firms, I cannot expand it any further. But maybe I misunderstood you. Could you clarify?

Thanks

Andrew Musau

Join Date: Oct 2014
Posts: 10281

19 Jul 2023, 06:24

I have already shown you how to do it in #2; I am not sure what is unclear. An alternative is to use -mixed- to estimate the RE model with frequency weights.

Code:

webuse nlswork, clear
drop if missing(tenure)
replace tenure= int(tenure) +1
mixed ln_w grade age [fw= tenure]|| id:, nolog
xtset id
expand tenure
xtreg ln_w grade age, mle nolog

Res.:

Code:

. mixed ln_w grade age [fw= tenure]|| id:, nolog

Mixed-effects ML regression                     Number of obs     =    104,243
Group variable: idcode                          Number of groups  =      4,697
                                                Obs per group:
                                                              min =          1
                                                              avg =       22.2
                                                              max =        202
                                                Wald chi2(2)      =   17082.98
Log likelihood = -5867.8337                     Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       grade |   .0812847   .0020865    38.96   0.000     .0771953    .0853741
         age |   .0191546    .000156   122.75   0.000     .0188488    .0194605
       _cons |   .0899616   .0273471     3.29   0.001     .0363622    .1435611
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
-----------------------------+------------------------------------------------
idcode: Identity             |
                  var(_cons) |   .1237168   .0027612      .1184216    .1292489
-----------------------------+------------------------------------------------
               var(Residual) |   .0564515   .0002531      .0559576    .0569498
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 1.0e+05       Prob >= chibar2 = 0.0000

.
. xtset id
       panel variable:  idcode (unbalanced)

.
. expand tenure
(76,171 observations created)

.
. xtreg ln_w grade age, mle nolog

Random-effects ML regression                    Number of obs     =    104,243
Group variable: idcode                          Number of groups  =      4,697

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =       22.2
                                                              max =        202

                                                LR chi2(2)        =   15685.17
Log likelihood  = -5867.8337                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       grade |   .0812847   .0020865    38.96   0.000     .0771953    .0853742
         age |   .0191546   .0001561   122.72   0.000     .0188487    .0194605
       _cons |   .0899616   .0273473     3.29   0.001      .036362    .1435613
-------------+----------------------------------------------------------------
    /sigma_u |    .351734   .0039252                      .3441244     .359512
    /sigma_e |   .2375953   .0005326                      .2365536    .2386416
         rho |   .6866734   .0049146                      .6769782    .6962413
------------------------------------------------------------------------------
LR test of sigma_u=0: chibar2(01) = 1.0e+05            Prob >= chibar2 = 0.000

.
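
Since the original question hinges on obtaining residuals, the following is a minimal sketch of how they might be requested after -mixed-. It reuses the nlswork example from this post; the variable names res_mixed and u_mixed are made up for illustration, and whether these residuals are the right input for a control function approach is left open.

```stata
* Sketch only: same illustrative data and weight as in the example above
webuse nlswork, clear
drop if missing(tenure)
replace tenure = int(tenure) + 1

mixed ln_wage grade age [fw = tenure] || idcode:, nolog

* residuals: response minus fitted values (res_mixed is a hypothetical name)
predict double res_mixed, residuals

* predicted random intercepts, in case they are needed separately
predict double u_mixed, reffects

summarize res_mixed u_mixed
```

In a bootstrap loop, the -predict- line would take the place of the -predict res, res- call that fails after -xtreg-.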


Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2406
#5

19 Jul 2023, 08:29

Originally posted by Lennart Osses:

Hey Andrew,

thanks for the fast reply. However, I am not able to follow what your advice would be - is it to expand my sample?

As I am working with a large panel data set on firms, I cannot expand it any further. But maybe I misunderstood you. Could you clarify?

If you have frequency weights, that means you have observations that have been "compressed" in the sense that multiple copies of the same record exist. For example, rather than having 5 observations with exactly the same values, you have only 1 record and a new variable (call it frequency) with a value of 5 -- this is your frequency weight. It is just a "stand in" for how many copies exist for each observation. Andrew showed you how to -expand- the data, which does the work of duplicating records.
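
The "compression" idea can be made concrete with a small sketch. It uses the auto data purely as an illustration, and the variable name frequency is an assumption chosen to match the wording above: -contract- collapses identical records into one row plus a count, and -expand- reverses that.

```stata
* Sketch only: auto data and the name "frequency" are illustrative choices
sysuse auto, clear
keep foreign rep78
drop if missing(rep78)
count                                   // one row per car

* compress identical records into one row plus a frequency weight
contract foreign rep78, freq(frequency)
list, sepby(foreign)

* undo the compression: each row is duplicated "frequency" times
expand frequency
count                                   // same number of rows as before contract
```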

I see that you are generating frequency weights after generating a cluster bootstrap sample. This is one way -- you would then need to -expand- and then renumber firms so that you properly account for the re-sampling procedure. You could also just ask -bsample- to create the new identifiers for you and forgo the frequency weight. Below are both approaches, illustrated. You can adapt for your own situation.

Code:

webuse nlswork, clear
xtset idcode year

* expansion approach
gen fw = .
bsample , cluster(idcode) weight(fw)
keep if fw>0
expand fw
bys idcode year : gen `c(obs_t)' sampnum = _n
bys idcode sampnum (year) : gen `c(obs_t)' newid = 1 if _n==1
replace newid = sum(newid)
drop sampnum
xtset newid year
xtreg ln_w grade age ttl_exp tenure, fe

* direct approach with bsample, idcluster()
* (reload the original data first, so newid does not already exist)
webuse nlswork, clear
bsample, cluster(idcode) idcluster(newid)
xtset newid year
xtreg ln_w grade age ttl_exp tenure, fe

Lennart Osses

Join Date: Oct 2019
Posts: 11

20 Jul 2023, 02:39

Hi,

thanks Leonardo - your reply was super helpful for understanding what -fweight- is actually doing. I have tried both versions that you proposed. However, for the "manual" option it appears that my code does not generate enough observations to run the regressions (Code1). For the second, "automated" option (Code2) I am again unable to save residuals after the regression (see initial problem).

Do you have any idea why this might be the case?

Thanks!

Code1:

Code:

            save DataBS_PAT.dta, replace
            
            set seed 1
            capture postclose bs
                postfile bs repl ExtFounderCEO res using bs, replace
                forvalues repl = 1/1000 {
                    noi dis "repl: " `repl'
                    clear
                    use "DataBS_PAT.dta"
                    sort Gvkey Year
                    xtset Gvkey Year
                    gen bsweight = .
                    bsample, cluster(Gvkey) weight(bsweight)
                    keep if bsweight>0
                    expand bsweight
                    bys Gvkey Year : gen `c(obs_t)' sampnum = _n
                    bys Gvkey Year sampnum (Year) : gen `c(obs_t)' newid = 1 if _n==1
                    replace newid = sum(newid)
                    drop sampnum
                    xtset newid Year                
                    xtreg ExtFounderCEO ShareExtFounderCEOIndPeers2 KM_EmployerBusinessNewness IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year, re
                    capture drop res
                    predict res, res
                    xtpoisson f.pat_filings ExtFounderCEO IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year res, fe vce(robust)
                    post bs (`repl') (_b[ExtFounderCEO]) (_b[res])
                }
                postclose bs

Code2:

Code:

        save DataBS_PAT.dta, replace
            
            set seed 1
            capture postclose bs
                postfile bs repl ExtFounderCEO res using bs, replace
                forvalues repl = 1/1000 {
                    noi dis "repl: " `repl'
                    clear
                    use "DataBS_PAT.dta"
                    sort Gvkey Year
                    xtset Gvkey Year
                    bsample, cluster(Gvkey) idcluster(newid)
                    xtset newid Year                
                    xtreg ExtFounderCEO ShareExtFounderCEOIndPeers2 KM_EmployerBusinessNewness IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year, re
                    capture drop res
                    predict res, res
                    xtpoisson f.pat_filings ExtFounderCEO IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year res, fe vce(robust)
                    post bs (`repl') (_b[ExtFounderCEO]) (_b[res])
                }
                postclose bs
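
One possible explanation for the Code1 symptom, offered as a reading of the posted code rather than a confirmed diagnosis: in #5 the new identifier is built with -bys idcode sampnum (year)-, whereas Code1 uses -bys Gvkey Year sampnum (Year)-. With Year inside the by-group, every row forms its own group, so newid becomes a row counter, each "panel" holds a single observation, and f.pat_filings is missing everywhere. A minimal sketch of the construction from #5, run on nlswork as a stand-in for the firm panel:

```stata
* Sketch only: nlswork stands in for the firm panel (idcode ~ Gvkey, year ~ Year)
webuse nlswork, clear
xtset idcode year

gen fw = .
bsample, cluster(idcode) weight(fw)
keep if fw > 0
expand fw

* number the copies of each unit-year
bys idcode year : gen long sampnum = _n

* one new id per (original id, copy); year appears only as a sort key
bys idcode sampnum (year) : gen long newid = 1 if _n == 1
replace newid = sum(newid)
drop sampnum

xtset newid year
xtdescribe          // panels should now span several years each
```

If -xtdescribe- reports one observation per panel, the by-group used to build newid is probably too fine.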


Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2406
#7

20 Jul 2023, 07:13

From -help xtreg postestimation-, go to -predict- and see what's available. There doesn't appear to be a -residuals- option for -predict- after -xtreg-; instead, there are other options that return different residual components. You'll need to determine which one you need and adjust the -predict- commands accordingly.
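
As a rough illustration of that advice, the sketch below requests the residual-type statistics that -predict- offers after -xtreg, re-. It uses nlswork rather than the firm data, and the new variable names are made up; which component belongs in the second stage of a control function approach is a modelling decision this sketch does not settle.

```stata
* Sketch only: nlswork data and the new variable names are illustrative
webuse nlswork, clear
xtset idcode year

xtreg ln_wage grade age ttl_exp tenure, re

predict double e_it,  e      // idiosyncratic error component e_it
predict double u_i,   u      // random-effect component u_i
predict double ue_it, ue     // combined residual u_i + e_it

summarize e_it u_i ue_it
```

In the bootstrap loops above, one of these options would replace the -res- option in -predict res, res-.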
