
  • Large coefficient and its solution

    Dear profs and colleagues,

    I am running the model below. As you can see, the coefficients and standard errors are very large. Do you know what the reason is and what can be done about it?
    Code:
    input float ln_mig_firm double firm_age float(foreign_aff region) byte per float(sector immi_sh S_emplo_sh impu_sh_origin S_emplo_iv)
     .6931472   3 1 3 0  9 .06990359 0 .024657136 0
    2.0794415 151 0 1 0  3 .01748677 0 .019646525 0
    1.0986123  85 1 3 0  7 .06990359 0 .024657136 0
    1.0986123  38 1 3 0  7 .06990359 0 .024657136 0
    1.0986123  37 1 3 0  6 .06990359 0 .024657136 0
    1.3862944  81 1 2 0  3 .03521399 0 .018607123 0
     .6931472  40 1 1 0  7 .01748677 0 .019646525 0
     .6931472  53 1 5 0  3 .05653952 0 .005587324 0
      1.94591  54 1 2 0  3 .03521399 0 .018607123 0
     .6931472  52 1 3 0 13 .06990359 0 .024657136 0
     .6931472  53 0 1 0  7 .01748677 0 .019646525 0
    2.0794415  69 1 1 0  3 .01748677 0 .019646525 0
    1.7917595  40 1 1 0  3 .01748677 0 .019646525 0
     2.944439  39 1 3 0  9 .06990359 0 .024657136 0
    1.3862944  51 1 1 0  3 .01748677 0 .019646525 0
     3.367296  69 1 3 0  6 .06990359 0 .024657136 0
    2.1972246  52 1 1 0  6 .01748677 0 .019646525 0
     3.178054  37 1 3 0  6 .06990359 0 .024657136 0
     .6931472  39 1 3 0  3 .06990359 0 .024657136 0
     1.609438  22 1 3 0  9 .06990359 0 .024657136 0
     .6931472  47 0 1 0  7 .01748677 0 .019646525 0
    1.0986123  90 1 2 0  7 .03521399 0 .018607123 0
     1.609438  43 1 4 0 13 .15849853 0 .010140276 0
    1.0986123  51 1 2 0 11 .03521399 0 .018607123 0
     .6931472  61 1 1 0  7 .01748677 0 .019646525 0
    1.0986123  39 1 3 0  3 .06990359 0 .024657136 0
    2.6390574  37 1 2 0  3 .03521399 0 .018607123 0
      1.94591  36 1 2 0  3 .03521399 0 .018607123 0
     2.397895  81 1 2 0  3 .03521399 0 .018607123 0
    1.0986123  56 1 1 0  7 .01748677 0 .019646525 0
     1.609438  40 1 1 0  3 .01748677 0 .019646525 0
    3.4011974  71 1 3 0  9 .06990359 0 .024657136 0
    2.0794415  38 1 3 0  7 .06990359 0 .024657136 0
     .6931472  44 1 1 0  3 .01748677 0 .019646525 0
     .6931472  36 1 2 0  3 .03521399 0 .018607123 0
     1.609438  40 1 1 0  3 .01748677 0 .019646525 0
     1.609438  44 0 3 0  3 .06990359 0 .024657136 0
     2.397895  45 1 3 0  3 .06990359 0 .024657136 0
    2.6390574  45 1 2 0  3 .03521399 0 .018607123 0
    1.0986123  46 1 3 0  9 .06990359 0 .024657136 0
     2.484907  46 0 1 0  3 .01748677 0 .019646525 0
    end

    xtivreg ln_mig_firm firm_age foreign_aff i.region#i.per i.sector#i.per (immi_sh S_emplo_sh = impu_sh_origin S_emplo_iv), fe first vce(robust)
    
    
    Fixed-effects (within) IV regression            Number of obs     =     15,213
    Group variable: NPC_FIC                         Number of groups  =      3,998
    
    R-squared:                                      Obs per group:
         Within  = 0.0810                                         min =          1
         Between = 0.0011                                         avg =        3.8
         Overall = 0.0001                                         max =         10
    
    
                                                    Wald chi2(71)     =   55406.90
    corr(u_i, Xb) = -0.5211                         Prob > chi2       =     0.0000
    
                                (Std. err. adjusted for 3,998 clusters in NPC_FIC)
    ------------------------------------------------------------------------------
                 |               Robust
     ln_mig_firm | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
         immi_sh |   66.08293   5.802629    11.39   0.000     54.70999    77.45587
      S_emplo_sh |  -12.28756   9.261594    -1.33   0.185    -30.43995    5.864832
        firm_age |  -.0292598   .0076699    -3.81   0.000    -.0442925    -.014227
     foreign_aff |  -.1434471   .0831989    -1.72   0.085    -.3065139    .0196198
                 |
      region#per |
            1 1  |  -.2850273    .253099    -1.13   0.260    -.7810923    .2110377
    Cheers,
    Paris

  • #2
    In a linear model, the size of a coefficient depends on the scale of its variable. The standard error isn't large relative to the coefficient.


    • #3
      Dear Ford,

      What is your suggestion for reducing the scales?


      • #4
        Rescale the Xs: multiply immi_sh by 100 and the coefficient will be 0.66.

        Or, use margins to get the effect. It will account for the scale.
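
        A minimal sketch of the rescaling point, using simulated data rather than the data in #1 (x_share, x_pct and y are made-up names): multiplying a regressor by 100 divides its coefficient and standard error by 100 and leaves the test statistic unchanged.

        ```stata
        * Simulated data; these are hypothetical stand-ins, not the poster's variables
        clear
        set seed 12345
        set obs 500
        generate double x_share = 0.2*runiform()      // a share between 0 and 0.2
        generate double y = 1 + 60*x_share + rnormal()

        * Regressor measured as a share
        regress y x_share
        scalar b_share = _b[x_share]
        scalar t_share = _b[x_share]/_se[x_share]

        * Same regressor measured in percentage points
        generate double x_pct = 100*x_share
        regress y x_pct
        scalar b_pct = _b[x_pct]
        scalar t_pct = _b[x_pct]/_se[x_pct]

        display "ratio of coefficients:      " b_share/b_pct
        display "difference in t statistics: " t_share - t_pct
        ```

        The displayed ratio should be 100 and the difference in t statistics zero, up to rounding. The fit of the model is identical in both runs.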


        • #5
          What about standardizing the variables?
          Code:
          egen S_ln_mig_firm = std(ln_mig_firm)
          egen S_firm_age = std(firm_age)
          egen S_foreign_aff = std(foreign_aff)
          egen S_immi_sh = std(immi_sh)
          
                                                       F(23,3997)        =          .
          corr(u_i, Xb) = -0.3411                         Prob > F          =          .
          
                                       (Std. err. adjusted for 3,998 clusters in NPC_FIC)
          -------------------------------------------------------------------------------
                        |               Robust
          S_ln_mig_firm | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          --------------+----------------------------------------------------------------
              S_immi_sh |  -.0867684   .0904104    -0.96   0.337    -.2640232    .0904863
             S_firm_age |  -.1856058   .1545148    -1.20   0.230     -.488541    .1173294
          S_foreign_aff |  -.0547414    .030351    -1.80   0.071    -.1142463    .0047635
                        |
                 sector |
                     6  |  -.3900656   .2201712    -1.77   0.077     -.821724    .0415927
          Am I allowed to do so?


          • #6
            No one will put you in jail for standardizing coefficients, but can you meaningfully interpret them? Standardizing is useful when your variable has no meaningful scale (e.g. some index); otherwise you should stay away from it, as you will lose more than you will gain.
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------


            • #7
              Have you run it without the IV? It could be a problem with a bad IV. If the size of the coefficient is very different, then I'd start there.
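
              One way to make that comparison, sketched on a simulated panel (id, t, x, z, u and y are all hypothetical; substitute your own variables and instruments): estimate the fixed-effects model with and without the instrument and put the two sets of estimates side by side.

              ```stata
              * Simulated panel: 400 units observed 5 times each (all names are hypothetical)
              clear
              set seed 2024
              set obs 400
              generate id = _n
              expand 5
              bysort id: generate t = _n
              xtset id t

              generate double z = rnormal()                  // instrument
              generate double u = rnormal()                  // unobserved confounder
              generate double x = 0.5*z + u + rnormal()      // endogenous regressor
              generate double y = 1*x + 2*u + rnormal()      // true effect of x is 1

              * Fixed effects without the instrument
              xtreg y x, fe vce(robust)
              estimates store fe_noiv

              * Fixed effects with x instrumented by z
              xtivreg y (x = z), fe vce(robust)
              estimates store fe_iv

              estimates table fe_noiv fe_iv, b(%9.3f) se(%9.3f)
              ```

              In this simulation x is endogenous by construction, so the two estimates are expected to differ. With real data a large gap is a signal to look harder at the instrument, not proof that either estimate is right.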


              • #8
                Originally posted by Maarten Buis
                No one will put you in jail for standardizing coefficients, but can you meaningfully interpret them?
                Dear Maarten, thank you for getting back to me, and good news about jail.
                The point is that the independent variables do not have the same scale: one is between 0 and 1, others are between 1 and 10. By standardizing (centering) the variables, I get more or less reasonable results.

                A second question arises: how should I interpret the standardized coefficient? Does the interpretation differ from the unstandardized one?



                • #9
                  Originally posted by George Ford
                  Have you run it without the iv? Could be a problem with a bad IV.
                  Dear Ford,
                  The situation is mixed. First of all, your point about different scales is totally correct. Secondly, the coefficients make sense once I run the regression without the IV: xtreg ln_mig_firm immi_sh firm_age foreign_aff i.sector i.region i.year, fe robust
                  So I don't know exactly where the problem is: different scales, a bad IV, or both!


                  • #10
                    Originally posted by Paris Rira
                    The point is that the independent variables do not have the same scale: one is between 0 and 1, others are between 1 and 10.
                    Why would that be a problem? You just need to know what the scales mean and interpret your coefficients correctly. It does not matter whether one number is bigger than another, because you interpret each coefficient on its own merits, not in comparison with the others.
                    ---------------------------------
                    Maarten L. Buis
                    University of Konstanz
                    Department of history and sociology
                    box 40
                    78457 Konstanz
                    Germany
                    http://www.maartenbuis.nl
                    ---------------------------------


                    • #11
                      Actually it affects the coefficient. Please have a look at #1.
                      immi_sh | 66.08293 such a large number


                      • #12
                        I think you have an IV problem if that big coefficient is absent in xtreg.


                        • #13
                          Finding a proper IV is indeed not easy. This IV has already worked with another dependent variable, but with this new dependent variable it does not.


                          • #14
                            Might have a scale problem there too. You could study the first stage prediction. Probably a couple of outliers or something else strange.
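
                            A rough sketch of how one might inspect the first stage by hand, on simulated data with a deliberately weak instrument (id, t, x and z are hypothetical names). The first option of xtivreg already prints the first-stage regression; this just makes the pieces explicit.

                            ```stata
                            * Simulated panel: 300 units observed 4 times each (all names are hypothetical)
                            clear
                            set seed 4321
                            set obs 300
                            generate id = _n
                            expand 4
                            bysort id: generate t = _n
                            xtset id t

                            generate double z = rnormal()              // instrument
                            generate double x = 0.05*z + rnormal()     // endogenous regressor, weakly related to z

                            * First stage: endogenous regressor on the instrument, with fixed effects
                            xtreg x z, fe vce(robust)
                            test z
                            scalar F_first = r(F)
                            display "first-stage F for the instrument: " F_first

                            * Fitted values: little variation or a few extreme values are warning signs
                            predict double xhat, xb
                            summarize xhat x, detail
                            ```

                            A small first-stage F (a common rule of thumb is below 10) suggests a weak instrument, which can produce large and unstable IV coefficients.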


                            • #15
                              Originally posted by Paris Rira
                              Actually it affects the coefficient. Please have a look at #1.
                              immi_sh | 66.08293 such a large number
                              Of course, and it should. But it does not matter, as long as you interpret it correctly. Assume that the variable immi_sh is the share of immigrants, theoretically ranging from 0 (no immigrants) to 1 (everybody is an immigrant). In that case your interpretation is that your expected outcome changes by 66 (whatever the unit is) if you go from a no-immigrant society to an all-immigrant society. You consider the effect large, but that makes sense: the change in x is also pretty extreme...

                              As George Ford already mentioned in #4, it is probably easier to interpret if you first multiply immi_sh by 100, so that you are talking about a one percentage point change in immigrant share rather than going from no immigrants to all immigrants. That way your interpretation will be that your outcome changes by 0.66 for a one percentage point change in immigrant share. Notice that this is just a different way of saying the same thing, so the underlying model does not change even though the numbers change.

                              Standardizing has a couple of problems: now your interpretation is that your outcome changes by b standard deviations for a one standard deviation change in immigrant share. What does that mean? On top of that you are dealing with panel data: which standard deviation do you want to use? I strongly recommend you stay away from that can of worms.
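
                              For what it is worth, the mechanical link between the two sets of numbers can be checked on simulated data (x, y, zx and zy are hypothetical names): the standardized slope equals the raw slope times sd(x)/sd(y). This sketch uses the overall standard deviation in a plain cross-section and ignores the panel question raised above.

                              ```stata
                              * Simulated cross-section; all names are hypothetical
                              clear
                              set seed 987
                              set obs 500
                              generate double x = 0.1*runiform()
                              generate double y = 2 + 30*x + rnormal()

                              * Raw regression and the two overall standard deviations
                              regress y x
                              scalar b_raw = _b[x]
                              quietly summarize x
                              scalar sd_x = r(sd)
                              quietly summarize y
                              scalar sd_y = r(sd)

                              * Regression on standardized variables
                              egen double zx = std(x)
                              egen double zy = std(y)
                              regress zy zx
                              scalar b_std = _b[zx]

                              display "standardized slope:      " b_std
                              display "raw slope * sd(x)/sd(y): " b_raw*sd_x/sd_y
                              ```

                              The two displayed numbers should agree up to rounding, which shows that standardizing only re-expresses the same fit in units that depend on the sample's standard deviations.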
                              ---------------------------------
                              Maarten L. Buis
                              University of Konstanz
                              Department of history and sociology
                              box 40
                              78457 Konstanz
                              Germany
                              http://www.maartenbuis.nl
                              ---------------------------------
