
  • Alternatives for -fweight- in -xtreg, re-

    Hi everyone,

    I want to run a bootstrapping procedure for a control function approach to check the robustness of my results. For this, I first regress my independent variable on two instruments. I then want to predict the residuals from this first stage and use them in my second-stage regression. Full code below.

    My issue: I use random effects for the first-stage regression because one of the instruments is time-invariant. However, -xtreg, re- does not allow -fweight-, so I cannot use it as intended:
    weight not allowed with between-effects and random-effects models
    r(101)
    I have tried using -xtreg, mle- as per this thread; however, my code then no longer allows me to predict the residuals, which is essential for what I want to achieve:
    option res not allowed
    r(198)
    Are there any alternative solution approaches that you could recommend?

    It has been quite a while since I last posted, so I hope I have described my issue clearly. If not, please let me know.

    Thanks - looking forward to your replies.


    Code:
    save DataBS_PAT.dta, replace
                
                set seed 1
                capture postclose bs
                    postfile bs repl ExtFounderCEO res using bs, replace
                    forvalues repl = 1/1000 {
                        noi dis "repl: " `repl'
                        clear
                        use "DataBS_PAT.dta"
                        sort Gvkey Year
                        xtset Gvkey Year
                        gen bsweight = .
                        bsample, cluster(Gvkey) weight(bsweight)
                        xtreg ExtFounderCEO ShareExtFounderCEOIndPeers2 KM_EmployerBusinessNewness IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year [fweight = bsweight], re
                        capture drop res
                        predict res, res
                        xtpoisson f.pat_filings ExtFounderCEO IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year res, fe vce(robust)
                        post bs (`repl') (_b[ExtFounderCEO]) (_b[res])
                    }
                    postclose bs

  • #2
    Expanding the number of observations replicates frequency weights.


    Code:
    sysuse auto, clear
    drop if missing(rep78)
    *FREQ. WEIGHTS
    regress mpg weight [fweight= rep78]
    *EXPAND OBS.
    expand rep78
    regress mpg weight
    Res.:

    Code:
    . sysuse auto, clear
    (1978 Automobile Data)
    
    . drop if missing(rep78)
    (5 observations deleted)
    
    . regress mpg weight [fweight= rep78]
    
          Source |       SS           df       MS      Number of obs   =       235
    -------------+----------------------------------   F(1, 233)       =    397.33
           Model |  5984.38543         1  5984.38543   Prob > F        =    0.0000
        Residual |  3509.34223       233  15.0615546   R-squared       =    0.6304
    -------------+----------------------------------   Adj R-squared   =    0.6288
           Total |  9493.72766       234  40.5714857   Root MSE        =    3.8809
    
    ------------------------------------------------------------------------------
             mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          weight |  -.0062661   .0003144   -19.93   0.000    -.0068854   -.0056468
           _cons |   40.39523   .9585906    42.14   0.000     38.50662    42.28384
    ------------------------------------------------------------------------------
    
    . expand rep78
    (166 observations created)
    
    . regress mpg weight
    
          Source |       SS           df       MS      Number of obs   =       235
    -------------+----------------------------------   F(1, 233)       =    397.33
           Model |  5984.38543         1  5984.38543   Prob > F        =    0.0000
        Residual |  3509.34223       233  15.0615546   R-squared       =    0.6304
    -------------+----------------------------------   Adj R-squared   =    0.6288
           Total |  9493.72766       234  40.5714857   Root MSE        =    3.8809
    
    ------------------------------------------------------------------------------
             mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
          weight |  -.0062661   .0003144   -19.93   0.000    -.0068854   -.0056468
           _cons |   40.39523   .9585906    42.14   0.000     38.50662    42.28384
    ------------------------------------------------------------------------------
    
    .
    Last edited by Andrew Musau; 19 Jul 2023, 03:43.



    • #3
      Hey Andrew,

      thanks for the fast reply. However, I am not able to follow what your advice would be - is it to expand my sample?

      As I am working with a large panel data set on firms, I cannot expand it any further. But maybe I misunderstood you. Could you clarify?

      Thanks



      • #4
        I have already shown you how to do it in #2, so I am not sure what is unclear. An alternative is to use -mixed- to estimate the RE model with frequency weights.

        Code:
        webuse nlswork, clear
        drop if missing(tenure)
        replace tenure= int(tenure) +1
        mixed ln_w grade age [fw= tenure]|| id:, nolog
        xtset id
        expand tenure
        xtreg ln_w grade age, mle nolog
        Res.:

        Code:
        . mixed ln_w grade age [fw= tenure]|| id:, nolog
        
        Mixed-effects ML regression                     Number of obs     =    104,243
        Group variable: idcode                          Number of groups  =      4,697
                                                        Obs per group:
                                                                      min =          1
                                                                      avg =       22.2
                                                                      max =        202
                                                        Wald chi2(2)      =   17082.98
        Log likelihood = -5867.8337                     Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
               grade |   .0812847   .0020865    38.96   0.000     .0771953    .0853741
                 age |   .0191546    .000156   122.75   0.000     .0188488    .0194605
               _cons |   .0899616   .0273471     3.29   0.001     .0363622    .1435611
        ------------------------------------------------------------------------------
        
        ------------------------------------------------------------------------------
          Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
        -----------------------------+------------------------------------------------
        idcode: Identity             |
                          var(_cons) |   .1237168   .0027612      .1184216    .1292489
        -----------------------------+------------------------------------------------
                       var(Residual) |   .0564515   .0002531      .0559576    .0569498
        ------------------------------------------------------------------------------
        LR test vs. linear model: chibar2(01) = 1.0e+05       Prob >= chibar2 = 0.0000
        
        .
        . xtset id
               panel variable:  idcode (unbalanced)
        
        .
        . expand tenure
        (76,171 observations created)
        
        .
        . xtreg ln_w grade age, mle nolog
        
        Random-effects ML regression                    Number of obs     =    104,243
        Group variable: idcode                          Number of groups  =      4,697
        
        Random effects u_i ~ Gaussian                   Obs per group:
                                                                      min =          1
                                                                      avg =       22.2
                                                                      max =        202
        
                                                        LR chi2(2)        =   15685.17
        Log likelihood  = -5867.8337                    Prob > chi2       =     0.0000
        
        ------------------------------------------------------------------------------
             ln_wage | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
        -------------+----------------------------------------------------------------
               grade |   .0812847   .0020865    38.96   0.000     .0771953    .0853742
                 age |   .0191546   .0001561   122.72   0.000     .0188487    .0194605
               _cons |   .0899616   .0273473     3.29   0.001      .036362    .1435613
        -------------+----------------------------------------------------------------
            /sigma_u |    .351734   .0039252                      .3441244     .359512
            /sigma_e |   .2375953   .0005326                      .2365536    .2386416
                 rho |   .6866734   .0049146                      .6769782    .6962413
        ------------------------------------------------------------------------------
        LR test of sigma_u=0: chibar2(01) = 1.0e+05            Prob >= chibar2 = 0.000
        
        .



        • #5
          Originally posted by Lennart Osses
          Hey Andrew,

          thanks for the fast reply. However, I am not able to follow what your advice would be - is it to expand my sample?

          As I am working with a large panel data set on firms, I cannot expand it any further. But maybe I misunderstood you. Could you clarify?
          If you have frequency weights, that means you have observations that have been "compressed" in the sense that multiple copies of the same record exist. For example, rather than having 5 observations with exactly the same values, you have only 1 record and a new variable (call it frequency) with a value of 5 -- this is your frequency weight. It is just a "stand in" for how many copies exist for each observation. Andrew showed you how to -expand- the data, which does the work of duplicating records.

          I see that you are generating frequency weights after generating a cluster bootstrap sample. This is one way, but you would then need to -expand- the data and renumber the firms so that you properly account for the resampling procedure. You could also just ask -bsample- to create the new identifiers for you and forgo the frequency weight. Both approaches are illustrated below; you can adapt them to your own situation.

          Code:
          webuse nlswork, clear
          xtset idcode year
          
          * expansion approach
          gen fw = .
          bsample , cluster(idcode) weight(fw)
          keep if fw>0
          expand fw
          bys idcode year : gen `c(obs_t)' sampnum = _n
          bys idcode sampnum (year) : gen `c(obs_t)' newid = 1 if _n==1
          replace newid = sum(newid)
          drop sampnum
          
          xtset newid year
          xtreg ln_w grade age ttl_exp tenure, fe
          
          * direct approach with bsample, idcluster()
          * (reload the original data first; the block above has already resampled it)
          webuse nlswork, clear
          bsample, cluster(idcode) idcluster(newid)
          
          xtset newid year
          xtreg ln_w grade age ttl_exp tenure, fe
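
          A minimal sketch of how the renumbering in the expansion approach could be checked, using the same nlswork example data. The seed value and the explicit -long- storage type are assumptions made for illustration. Note that the by-group in the renumbering step is the original cluster plus the copy number only, with year acting as a sort key; if year were part of the by-group, every row would start its own group and each new panel would contain a single observation.

          ```stata
          webuse nlswork, clear
          set seed 12345                      // arbitrary seed, only for reproducibility
          gen fw = .
          bsample, cluster(idcode) weight(fw) // fw = number of times each cluster was drawn
          keep if fw > 0
          expand fw
          bysort idcode year: gen long sampnum = _n
          * by-group is (idcode sampnum); year only orders rows within the group
          bysort idcode sampnum (year): gen long newid = 1 if _n == 1
          replace newid = sum(newid)
          * checks: no repeated years within a new panel, and panels keep their length
          duplicates report newid year
          xtset newid year
          xtdescribe
          ```

          If -duplicates report- shows surplus copies, or -xtdescribe- reports panels of length one, the renumbering step is the first place to look.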



          • #6
            Hi,

            thanks Leonardo - your reply was super helpful for understanding what -fweight- is actually doing. I have tried both versions that you proposed. However, for the "manual" option it appears that my code does not generate enough observations to run the regressions (Code1). For the second, "automated" option (Code2), I am again not able to save the residuals after the regression (see initial problem).

            Do you have any idea why this might be the case?

            Thanks!

            Code1:
            Code:
                        save DataBS_PAT.dta, replace
                        
                        set seed 1
                        capture postclose bs
                            postfile bs repl ExtFounderCEO res using bs, replace
                            forvalues repl = 1/1000 {
                                noi dis "repl: " `repl'
                                clear
                                use "DataBS_PAT.dta"
                                sort Gvkey Year
                                xtset Gvkey Year
                                gen bsweight = .
                                bsample, cluster(Gvkey) weight(bsweight)
                                keep if bsweight>0
                                expand bsweight
                                bys Gvkey Year : gen `c(obs_t)' sampnum = _n
                                bys Gvkey Year sampnum (Year) : gen `c(obs_t)' newid = 1 if _n==1
                                replace newid = sum(newid)
                                drop sampnum
                                xtset newid Year                
                                xtreg ExtFounderCEO ShareExtFounderCEOIndPeers2 KM_EmployerBusinessNewness IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year, re
                                capture drop res
                                predict res, res
                                xtpoisson f.pat_filings ExtFounderCEO IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year res, fe vce(robust)
                                post bs (`repl') (_b[ExtFounderCEO]) (_b[res])
                            }
                            postclose bs
            Code2:
            Code:
                    save DataBS_PAT.dta, replace
                        
                        set seed 1
                        capture postclose bs
                            postfile bs repl ExtFounderCEO res using bs, replace
                            forvalues repl = 1/1000 {
                                noi dis "repl: " `repl'
                                clear
                                use "DataBS_PAT.dta"
                                sort Gvkey Year
                                xtset Gvkey Year
                                bsample, cluster(Gvkey) idcluster(newid)
                                xtset newid Year                
                                xtreg ExtFounderCEO ShareExtFounderCEOIndPeers2 KM_EmployerBusinessNewness IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year, re
                                capture drop res
                                predict res, res
                                xtpoisson f.pat_filings ExtFounderCEO IntFounderCEO IndustryPerformance IndustryHHI FirmTobQ FirmSize FirmRD3 CEOAge CEOTenure CEOEdu CEOTecDeg CEOPower i.Year res, fe vce(robust)
                                post bs (`repl') (_b[ExtFounderCEO]) (_b[res])
                            }
                            postclose bs



            • #7
              From -help xtreg postestimation-, go to -predict- and see what's available. There doesn't appear to be a -residual- option for predict, but instead other options for residuals. You'll need to determine which you need and make appropriate adjustments to the predict commands.
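
              A minimal sketch of that adjustment, using the nlswork example data from earlier in the thread rather than the firm panel (the seed and the variable names res_ue and res_e are assumptions for illustration). After -xtreg, re-, -predict- offers -ue- for the combined residual u_i + e_it and -e- for the idiosyncratic component e_it. Which of the two belongs in the control function is a modeling decision that this sketch does not settle.

              ```stata
              webuse nlswork, clear
              set seed 12345                            // arbitrary seed, only for reproducibility
              bsample, cluster(idcode) idcluster(newid) // resample clusters, create new panel id
              xtset newid year
              xtreg ln_wage grade age ttl_exp tenure, re
              predict double res_ue, ue                 // combined residual: u_i + e_it
              predict double res_e, e                   // idiosyncratic residual: e_it
              summarize res_ue res_e
              ```

              Either variable could then be added as a regressor in the second stage in place of the variable created by -predict res, res-.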
