
  • Stepwise with time & individual/country fixed effects panels with appropriate s.e. adjustment

    I was redirected here from Stack Overflow by Nick Cox with the following question:

    I want to run stepwise on a linear probability model with time and individual fixed effects in a panel dataset, but stepwise does not support panel estimators out of the box. The workaround is to run xtdata y x, fe followed by reg y x, r. However, the resulting standard errors are too small. One attempt to solve this problem can be found here: http://www.stata.com/statalist/archi.../msg00629.html but my panel is highly unbalanced (I have a different number of observations for different variables). I also don't understand how stepwise would take this information into account as it iterates over different variable lists. Since stepwise bases its decision rules on p-values, this is quite crucial.

    Reproducible Example:

    Code:
    webuse nlswork, clear /*unbalancing a bit:*/
    replace tenure =. if year <73
    replace hours =. if hours==24 | hours==38
    replace tenure =. if idcode==3 | idcode == 12 | idcode == 19
    
    xtreg ln_wage tenure hours union i.year, fe vce(robust)
    eststo xtregit /* this is what I want to reproduce without xtreg, to use it in stepwise */
    xi i.year /* year dummies to keep */
    xtdata ln_wage tenure hours union _Iyear*, fe clear
    reg ln_wage tenure hours union _Iyear*, vce(hc3) /*regression on transformed data */
    eststo regit
    esttab xtregit regit
    As you can see, the estimates are fine but I need to adjust the standard errors. Also, I need to do that in such a way that stepwise understands in its iterations when the number of variables changes, for example. Any help on how to proceed?
    Last edited by Peter Pan; 24 Sep 2015, 13:19.
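One likely reason for the too-small standard errors can be sketched for the conventional (non-robust) case only. After demeaning, regress counts residual degrees of freedom as N - K, while xtreg, fe uses N - n - K, where n is the number of panels. Rescaling by the square root of that ratio should recover the xtreg standard errors. The names m_* and d_* are illustrative, and demeaning by hand stands in for xtdata here.

```stata
webuse nlswork, clear

* Reference: within estimator with conventional SEs
quietly xtreg ln_wage tenure hours, fe
scalar se_xt = _se[tenure]
scalar df_xt = e(df_r)            // N - n - K
keep if e(sample)

* Demean each variable within idcode (illustrative names m_*, d_*)
foreach v of varlist ln_wage tenure hours {
    bysort idcode: egen double m_`v' = mean(`v')
    generate double d_`v' = `v' - m_`v'
}

* OLS on demeaned data: same coefficients, but df_r = N - K
quietly regress d_ln_wage d_tenure d_hours, noconstant
scalar se_adj = _se[d_tenure] * sqrt(e(df_r) / df_xt)

display "xtreg SE: " se_xt "   rescaled OLS SE: " se_adj
```

For the robust case, xtreg, fe vce(robust) clusters on the panel variable, so vce(cluster idcode) on the transformed data is presumably a closer analogue than vce(hc3). That is an assumption worth checking against the xtreg output.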

  • #2
    Alternatively, an indication of why this question is badly put would be fine... :-) No answers here or on Stack Overflow is a bad sign...



    • #3
      Peter Pan:
      the main issue with your query is that most on this list would discourage -stepwise- for any purpose (you can read a lot on this topic by typing -stepwise- in the search window at the top of the screen).
      That said, as tentative advice, have you investigated the possibility of including bootstrap SEs in your -stepwise- regression?
      Kind regards,
      Carlo
      (Stata 18.0 SE)
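As a minimal sketch of what bootstrap SEs look like for the fixed-effects model itself, the following uses xtreg's vce(bootstrap) option on the nlswork data from #1. The reps(50) and seed(12345) values are arbitrary choices made for speed and reproducibility, not recommendations. Whether -stepwise- can use these SEs inside its selection loop is not established here.

```stata
webuse nlswork, clear

* Bootstrap SEs for the within estimator; with xtreg the resampling
* is done over panels (idcode), not over individual observations.
* reps(50) is deliberately small so the example runs quickly.
xtreg ln_wage tenure hours union, fe vce(bootstrap, reps(50) seed(12345))

* Confirm which VCE was used
display "VCE type: `e(vce)'"
```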



      • #4
        I think Carlo's guess is good here.

        The allusion in #1 was to http://stackoverflow.com/questions/3...ta-with-approp where "Peter Pan" posted as "Jakob".

        Peter/Jakob: We prefer full real names on Statalist. Please see http://www.statalist.org/forums/help#realnames



        • #5
          I believe that stepwise keeps track of the number of observations by restricting the sample to observations that are nonmissing on all candidate variables (listwise omission), so it doesn't matter if you have a different number of observations for different variables; with stepwise, you're going to get the same intersection subset no matter what you do. I believe that's why the helpfile indicates that [if] is required for user-written commands to be used with stepwise: it behaves as if it first fits with all variables (regardless of whether you use forward selection, backward elimination or bidirectional selection) and then uses if e(sample) thereafter.
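The listwise behavior described above can be checked with a small sketch on the auto data, where rep78 is the only candidate with missing values. The choice of regress, the candidate list, and pr(0.2) are illustrative assumptions. If stepwise works as described, the final e(N) equals the complete-case count whether or not rep78 survives selection.

```stata
sysuse auto, clear

* Complete cases across the outcome and every candidate variable
count if !missing(price, mpg, rep78, weight, length)
local n_complete = r(N)

* Backward elimination over the same candidates
stepwise, pr(0.2): regress price mpg rep78 weight length

* Compare the final estimation sample with the complete-case count
display "complete cases: `n_complete'   e(N): " e(N)
```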

          Speaking of user-written commands, I'm sure that StataCorp has a good reason for not allowing stepwise to work with panel-data estimation commands out of the box, but can't you get around that restriction by writing a simple wrapper?

          . version 14.0

          . 
          . clear *

          . set more off

          . 
          . program define stepxt, properties(sw)
            1.     version 14.0
            2.     syntax varlist [if], i(varname)
            3. 
          .     xtreg `varlist' `if', i(`i') fe
            4. end

          . 
          . sysuse auto
          (1978 Automobile Data)

          . stepwise, pr(0.15) pe(0.05): ///
          >     stepxt foreign price mpg headroom trunk weight length turn displacement gear_ratio, i(rep78)
          note: 5 obs. dropped because of estimability
                                begin with full model
          p = 0.8650 >= 0.1500  removing headroom
          p = 0.6581 >= 0.1500  removing trunk
          p = 0.7541 >= 0.1500  removing length
          p = 0.2313 >= 0.1500  removing displacement

          Fixed-effects (within) regression               Number of obs     =         69
          Group variable: rep78                           Number of groups  =          5

          R-sq:                                           Obs per group:
               within  = 0.6291                                         min =          2
               between = 0.9277                                         avg =       13.8
               overall = 0.6920                                         max =         30

                                                          F(5,59)           =      20.02
          corr(u_i, Xb)  = 0.4010                         Prob > F          =     0.0000

          ------------------------------------------------------------------------------
               foreign |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                 price |   .0000482    .000013     3.72   0.000     .0000223    .0000742
                   mpg |   -.025849   .0086659    -2.98   0.004    -.0431894   -.0085086
            gear_ratio |   .5158113    .105895     4.87   0.000      .303916    .7277067
                  turn |  -.0217322   .0144041    -1.51   0.137    -.0505548    .0070903
                weight |  -.0002039   .0001065    -1.92   0.060    -.0004169    9.08e-06
                 _cons |   .4943063   .6827012     0.72   0.472    -.8717756    1.860388
          -------------+----------------------------------------------------------------
               sigma_u |  .17638947
               sigma_e |   .2356841
                   rho |  .35902562   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0: F(4, 59) = 4.63                       Prob > F = 0.0026

          . 
          . stepwise, pr(0.15) pe(0.05) forward: ///
          >     stepxt foreign price mpg headroom trunk weight length turn displacement gear_ratio, i(rep78)
          note: 5 obs. dropped because of estimability
                                begin with empty model
          p = 0.0000 <  0.0500  adding   gear_ratio
          p = 0.0012 <  0.0500  adding   price
          p = 0.0113 <  0.0500  adding   weight
          p = 0.0077 <  0.0500  adding   mpg

          Fixed-effects (within) regression               Number of obs     =         69
          Group variable: rep78                           Number of groups  =          5

          R-sq:                                           Obs per group:
               within  = 0.6148                                         min =          2
               between = 0.9548                                         avg =       13.8
               overall = 0.6771                                         max =         30

                                                          F(4,60)           =      23.94
          corr(u_i, Xb)  = 0.4150                         Prob > F          =     0.0000

          ------------------------------------------------------------------------------
               foreign |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
            gear_ratio |   .5062101    .106822     4.74   0.000     .2925344    .7198859
                 price |    .000054   .0000125     4.31   0.000     .0000289     .000079
                weight |   -.000308   .0000819    -3.76   0.000    -.0004719   -.0001441
                   mpg |   -.023868   .0086565    -2.76   0.008    -.0411836   -.0065525
                 _cons |  -.1034663   .5618534    -0.18   0.855     -1.22734    1.020408
          -------------+----------------------------------------------------------------
               sigma_u |  .18713438
               sigma_e |  .23817767
                   rho |  .38169031   (fraction of variance due to u_i)
          ------------------------------------------------------------------------------
          F test that all u_i=0: F(4, 60) = 4.90                       Prob > F = 0.0017

          . 
          . exit

          end of do-file


          I've been under the impression that both linear probability models and stepwise estimation are a bit sixtiesish. Have I missed something?
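The wrapper idea might be extended toward the setup in #1 (time and individual fixed effects with adjusted SEs) along the following lines. This is a sketch under assumptions, not a tested recipe: the program name stepxtcl is hypothetical, clustering on the panel variable is one possible SE choice, and the year dummies are forced into every model with stepwise's lockterm1 option.

```stata
capture program drop stepxtcl
program define stepxtcl, properties(sw)
    version 14.0
    syntax varlist [if], i(varname)
    * Same wrapper idea as stepxt, with SEs clustered on the panel variable
    xtreg `varlist' `if', i(`i') fe vce(cluster `i')
end

webuse nlswork, clear
xi i.year                      // creates the _Iyear* dummies
unab yeardums : _Iyear*        // expand the wildcard before calling stepwise

* lockterm1 keeps the first (parenthesized) term in every model,
* so only tenure, hours and union are candidates for removal
stepwise, pr(0.15) lockterm1: ///
    stepxtcl ln_wage (`yeardums') tenure hours union, i(idcode)
```

Because the selection still relies on p-values from a model chosen by the same data, the usual objections to stepwise raised earlier in the thread apply to this variant as well.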



          • #6
            I can't reproduce your routine.
