  • #16
    Hi Clyde,

    Sorry to bother you yesterday with simple questions about collinearity. I've simplified my analysis in the hope of making sure I'm calculating margins (the original topic of this thread) correctly. The motivation for the regression is that I'd like to use it to calculate predicted values for each state with all RHS covariates set at their global means. I'd think the following code would accomplish that:

    Code:
    reg deflator_temp c.win_margin unopposed senate president governor i.non_presidential_year i.non_senate_year i.non_gubernatorial_year i.election_cycle#i.state_code, r noconstant
    margins state_code#election_cycle, atmeans
    However, I still have one election_cycle#state_code combination that is collinear and drops out. So I thought I'd look at just one state and two years to make sure I am calculating margins the way I intend. When I do this, however, I am unable to produce estimates. There's no collinearity, no empty combinations, and an F-stat is reported, so I'm not sure why this is happening. Do you have any idea why? Thanks!

    Code:
    . keep if year>2008
    (2794 observations deleted)
    
    . keep if state_code==5
    (919 observations deleted)
    
    . reg deflator_temp c.win_margin unopposed senate president governor i.non_presidential_year i.ele
    > ction_cycle, noconstant
    
          Source |       SS       df       MS              Number of obs =     110
    -------------+------------------------------           F(  7,   103) =  312.91
           Model |   26.303232     7  3.75760458           Prob > F      =  0.0000
        Residual |  1.23690088   103  .012008746           R-squared     =  0.9551
    -------------+------------------------------           Adj R-squared =  0.9520
           Total |  27.5401329   110  .250364845           Root MSE      =  .10958
    
    -----------------------------------------------------------------------------------------
              deflator_temp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ------------------------+----------------------------------------------------------------
                 win_margin |   .0808634   .0586245     1.38   0.171    -.0354045    .1971313
                  unopposed |  -.1483611   .0881765    -1.68   0.095    -.3232385    .0265162
                     senate |   .0091408   .0786012     0.12   0.908    -.1467461    .1650277
                  president |    .006913   .1106506     0.06   0.950    -.2125364    .2263623
                   governor |   .0342378    .111239     0.31   0.759    -.1863786    .2548542
    1.non_presidential_year |   .3957682   .0244157    16.21   0.000     .3473454     .444191
          17.election_cycle |   .5264039   .0224857    23.41   0.000     .4818089     .570999
    -----------------------------------------------------------------------------------------
    
    . margins election_cycle, atmeans 
    
    Adjusted predictions                              Number of obs   =        110
    Model VCE    : OLS
    
    Expression   : Linear prediction, predict()
    at           : win_margin      =    .3129933 (mean)
                   unopposed       =    .0181818 (mean)
                   senate          =    .0181818 (mean)
                   president       =    .0090909 (mean)
                   governor        =    .0090909 (mean)
                   0.non_pres~r    =          .5 (mean)
                   1.non_pres~r    =          .5 (mean)
                   16.electio~e    =          .5 (mean)
                   17.electio~e    =          .5 (mean)
    
    --------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
    election_cycle |
               16  |          .  (not estimable)
               17  |          .  (not estimable)
    --------------------------------------------------------------------------------



    • #17
      Well, non-estimability results when there are empty cells in the design. So there are some combinations of 16.election_cycle and 17.election_cycle with the other model variables (win_margin unopposed senate president governor non_presidential_year) that simply do not occur in your data. You can run a series of cross-tabs to see what those are. You need to get more data to fill in those cells to truly solve the problem.
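
      As a minimal sketch of that cross-tab check (assuming the regression from #16 has just been run, so that e(sample) marks the estimation sample), each categorical covariate can be tabulated against election_cycle. Any cell with a frequency of zero is a combination that does not occur in the data:

      Code:
      * Assumes the -reg- command from #16 was the last estimation command.
      foreach v of varlist unopposed senate president governor non_presidential_year {
          tabulate election_cycle `v' if e(sample), missing
      }

      win_margin enters the model as continuous (c.win_margin), so it is left out of the loop.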

      If that isn't possible, there is a workaround that may or may not be suitable for your project. If you are willing to assume that the empty cells, were they not empty, would not affect the estimates (a pretty strong assumption, so don't make it just to get an answer), you can specify the -asbalanced emptycells(reweight)- options to the -margins- command. This will recompute your margins as if you had a design that was balanced on all variables and will treat the empty cells as if they were like the non-empty cells. But, again, this will give you answers, and they could be very misleading if the underlying assumption is incorrect.
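
      For illustration only, the call with those options would look like the line below. It reuses the model from #16; whether it turns "not estimable" into a number depends on the design, and the assumption described above still has to be defended.

      Code:
      * Treat the factor variables as balanced and reweight over empty cells.
      margins election_cycle, asbalanced emptycells(reweight)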

      If that assumption is incorrect, or margins as balanced are not appropriate for your purpose, and you can't get more data to fill in the gaps, then I'm pretty sure you are stuck.



      • #18
        "If that assumption is incorrect, or margins as balanced are not appropriate for your purpose, and you can't get more data to fill in the gaps, then I'm pretty sure you are stuck."
        Actually, on second thought, there is another option: remove from the model those variables other than election cycle that define the empty cells. Of course, you have to consider whether a model that lacks those variables is scientifically credible.
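
        A hedged sketch of that option, under the purely hypothetical assumption that the cross-tabs point to non_presidential_year as the variable defining the empty cells (the thread does not confirm this):

        Code:
        * Hypothetical: non_presidential_year is dropped from the model in #16.
        * ibn. keeps every level of election_cycle, so with noconstant each
        * cycle gets its own intercept.
        reg deflator_temp c.win_margin unopposed senate president governor ibn.election_cycle, noconstant
        margins election_cycle, atmeans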



        • #19
          Hi
          I have a question regarding these missing margins.
          Code:
          reg drinkalcohol i.year##i.bihar age agesq i.sex i.currentmarital i.hh_religion i.wealthindex i.residenceplace hhsize i.diistrict, cluster(psu)
          margins year#bihar
          marginsplot, xdimension(year)
          This gives the following output for the margins:
          Code:
          Predictive margins                                     Number of obs = 323,830
          Model VCE: Robust
          
          Expression: Linear prediction, predict()
          
          ------------------------------------------------------------------------------------------
                                   |            Delta-method
                                   |     Margin   std. err.      t    P>|t|     [95% conf. interval]
          -------------------------+----------------------------------------------------------------
                        year#bihar |
                 NFHS 4 (Pre Ban) #|
          States other than Bihar  |          .  (not estimable)
           NFHS 4 (Pre Ban)#Bihar  |          .  (not estimable)
                NFHS 5(Post Ban)  #|
          States other than Bihar  |          .  (not estimable)
          NFHS 5(Post Ban) #Bihar  |          .  (not estimable)
          ---------------------------------------------------------
          I do not understand why this is so, because the preceding regression gives full output, as follows:

          Code:
          Linear regression                               Number of obs     =    323,830
                                                          F(145, 11409)     =      52.34
                                                          Prob > F          =     0.0000
                                                          R-squared         =     0.2187
                                                          Root MSE          =     .20635
          
                                                     (Std. err. adjusted for 11,410 clusters in psu)
          ------------------------------------------------------------------------------------------
                                   |               Robust
                      drinkalcohol | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
          -------------------------+----------------------------------------------------------------
                              year |
                NFHS 5(Post Ban)   |  -.0313667   .0022244   -14.10   0.000    -.0357271   -.0270064
                                   |
                             bihar |
                            Bihar  |  -.1034804   .0226883    -4.56   0.000    -.1479534   -.0590073
                                   |
                        year#bihar |
          NFHS 5(Post Ban) #Bihar  |   .0194041   .0026456     7.33   0.000     .0142182      .02459
                                   |
                               age |   .0040023   .0003783    10.58   0.000     .0032607    .0047439
                             agesq |  -.0000386   5.89e-06    -6.55   0.000    -.0000501   -.0000271
                                   |
                               sex |
                             Male  |   .2695794   .0038684    69.69   0.000     .2619967    .2771621
                                   |
                    currentmarital |
                          married  |    .018982   .0013906    13.65   0.000     .0162562    .0217077
                                   |
                       hh_religion |
                           Muslim  |  -.0396486   .0017443   -22.73   0.000    -.0430677   -.0362295
                           Others  |   .0023733   .0071861     0.33   0.741    -.0117126    .0164593
                                   |
                       wealthindex |
                           poorer  |  -.0263213   .0014386   -18.30   0.000    -.0291412   -.0235013
                           middle  |  -.0371762   .0016381   -22.69   0.000    -.0403872   -.0339653
                           richer  |  -.0468664   .0018101   -25.89   0.000    -.0504144   -.0433184
                          richest  |  -.0564057   .0021809   -25.86   0.000    -.0606807   -.0521307
                                   |
                    residenceplace |
                            rural  |  -.0023045   .0017718    -1.30   0.193    -.0057775    .0011685
                            hhsize |   .0008145   .0001934     4.21   0.000     .0004355    .0011935
          Just in case you were wondering, the reason I did not use trendplots from didregress is that trendplots do not seem to work with missing data. Also, -asbalanced empty(reweight)- gives me the same results. Here's the data example from dataex:
          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input byte drinkalcohol float(year bihar) byte age float(agesq sex) byte(currentmarital hh_religion wealthindex residenceplace hhsize) int diistrict
          0 1 0 36 1296 0 1 0 1 2  3 182
          0 1 0 21  441 0 1 0 1 2  5 182
          0 1 0 37 1369 0 1 0 5 2  4 182
          0 1 0 26  676 1 1 0 3 2  7 182
          0 1 0 18  324 0 1 0 1 2  5 182
          0 1 0 28  784 0 1 0 1 2  3 182
          0 1 0 49 2401 0 1 1 2 2  3 182
          0 1 0 17  289 0 0 0 1 2  6 182
          0 1 0 22  484 0 1 0 1 2  6 182
          0 1 0 23  529 0 1 0 1 2  7 182
          1 1 0 32 1024 0 1 0 1 2  8 182
          0 1 0 25  625 0 1 0 4 2  8 182
          0 1 0 35 1225 0 1 0 1 2  2 182
          1 1 0 47 2209 0 1 0 2 2  5 182
          0 1 0 38 1444 0 1 0 1 2  6 182
          0 1 0 20  400 1 0 0 1 2  5 182
          0 1 0 17  289 0 0 0 1 2  5 182
          0 1 0 28  784 0 1 0 1 2  7 182
          0 1 0 44 1936 0 1 0 1 2  7 182
          0 1 0 28  784 0 1 0 5 2  8 182
          0 1 0 18  324 0 0 0 2 2  6 182
          0 1 0 23  529 0 1 0 1 2 13 182
          0 1 0 49 2401 0 1 0 5 1  7 182
          0 1 0 22  484 0 1 0 1 2  8 182
          0 1 0 21  441 0 1 0 1 2  4 182
          0 1 0 28  784 0 1 0 3 2  7 182
          1 1 0 49 2401 1 1 0 1 2  4 182
          0 1 0 23  529 0 1 0 3 2  8 182
          0 1 0 20  400 0 1 0 2 2  9 182
          0 1 0 45 2025 0 1 0 2 2  4 182
          0 1 0 28  784 0 1 0 1 2  5 182
          1 1 0 35 1225 0 1 0 1 2  7 182
          0 1 0 31  961 1 1 0 4 2  4 182
          0 1 0 41 1681 0 1 0 1 2  6 182
          0 1 0 21  441 0 . 0 1 2  6 182
          0 1 0 26  676 0 1 0 1 2  5 182
          0 1 0 24  576 0 1 0 1 2  3 182
          0 1 0 17  289 0 0 0 3 2  6 182
          0 1 0 23  529 0 1 0 1 2  5 182
          0 1 0 25  625 0 0 0 1 2  3 182
          0 1 0 22  484 0 1 0 1 2  4 182
          0 1 0 31  961 0 1 0 1 2  3 182
          0 1 0 34 1156 0 1 0 1 2  4 182
          0 1 0 44 1936 0 1 0 2 2  8 182
          0 1 0 18  324 0 1 0 2 2  2 182
          0 1 0 21  441 0 1 1 4 2  3 182
          0 1 0 32 1024 0 1 0 1 2  5 182
          0 1 0 30  900 1 1 0 3 1  9 182
          0 1 0 18  324 0 1 0 3 2  5 182
          0 1 0 32 1024 0 1 0 1 2  5 182
          0 1 0 17  289 0 0 0 1 2  2 182
          0 1 0 30  900 1 1 0 1 2  5 182
          0 1 0 20  400 0 0 2 1 2  4 182
          0 1 0 29  841 0 1 0 1 2  2 182
          0 1 0 17  289 0 0 0 1 2  7 182
          0 1 0 54 2916 1 1 0 4 2  6 182
          0 1 0 37 1369 0 1 0 5 2  9 182
          0 1 0 22  484 0 0 0 1 2  9 182
          1 1 0 33 1089 0 1 0 1 2  5 182
          0 1 0 32 1024 1 1 0 1 2  6 182
          0 1 0 28  784 0 0 0 4 1  4 182
          0 1 0 26  676 1 1 0 1 2  3 182
          0 1 0 22  484 0 1 0 1 2  4 182
          0 1 0 35 1225 0 1 0 2 2  6 182
          0 1 0 24  576 0 0 0 2 2  8 182
          1 1 0 46 2116 0 1 0 1 2  4 182
          1 1 0 22  484 0 1 0 1 2  3 182
          0 1 0 32 1024 0 1 0 1 2  6 182
          1 1 0 45 2025 0 1 0 1 2  2 182
          1 1 0 21  441 0 1 0 1 2  2 182
          0 1 0 26  676 0 1 0 1 2  8 182
          1 1 0 32 1024 1 1 0 1 2  5 182
          0 1 0 39 1521 0 1 0 2 2  2 182
          0 1 0 26  676 0 1 0 1 2  6 182
          0 1 0 32 1024 1 1 0 1 2  5 182
          1 1 0 31  961 1 1 0 1 2  6 182
          0 1 0 23  529 0 1 0 1 2  4 182
          0 1 0 26  676 0 0 0 5 1  7 182
          0 1 0 18  324 0 1 0 1 2  5 182
          0 1 0 39 1521 0 1 0 1 2  6 182
          0 1 0 27  729 0 0 0 2 2  4 182
          0 1 0 35 1225 0 1 0 1 2  5 182
          0 1 0 16  256 0 0 0 1 2  7 182
          0 1 0 32 1024 0 1 0 1 2  4 182
          0 1 0 35 1225 0 1 0 3 1  5 182
          0 1 0 41 1681 0 1 0 1 2  5 182
          0 1 0 30  900 0 1 0 1 2  7 182
          1 1 0 37 1369 0 1 2 1 2  5 182
          0 1 0 45 2025 0 1 0 1 2  5 182
          1 1 0 40 1600 0 1 0 1 2  6 182
          0 1 0 23  529 0 1 0 1 2  6 182
          0 1 0 40 1600 0 1 0 1 2  4 182
          0 1 0 17  289 0 0 0 2 2  5 182
          0 1 0 16  256 0 0 0 1 1  3 182
          0 1 0 15  225 0 0 0 5 2  5 182
          0 1 0 21  441 0 0 0 1 2  6 182
          0 1 0 35 1225 0 1 0 1 2  4 182
          0 1 0 43 1849 0 1 0 1 2  5 182
          1 1 0 28  784 1 1 0 1 2  3 182
          0 1 0 34 1156 1 1 0 5 2  4 182
          end
          label values drinkalcohol S720
          label def S720 0 "no", modify
          label def S720 1 "yes", modify
          label values year year
          label def year 1 "NFHS 5(Post Ban)", modify
          label values bihar treatment
          label def treatment 0 "States other than Bihar", modify
          label values sex sex
          label def sex 0 "Female", modify
          label def sex 1 "Male", modify
          label values currentmarital maritalstatus
          label def maritalstatus 0 "unmarried", modify
          label def maritalstatus 1 "married", modify
          label values hh_religion religion
          label def religion 0 "Hindu", modify
          label def religion 1 "Muslim", modify
          label def religion 2 "Others", modify
          label values wealthindex V190
          label def V190 1 "poorest", modify
          label def V190 2 "poorer", modify
          label def V190 3 "middle", modify
          label def V190 4 "richer", modify
          label def V190 5 "richest", modify
          label values residenceplace V025
          label def V025 1 "urban", modify
          label def V025 2 "rural", modify
          label values diistrict SDIST
          label def SDIST 182 "balrampur", modify
          Last edited by Rajdeep Chaudhuri; 26 Jan 2025, 14:30.



          • #20
            First off, I personally would start a new thread rather than add on to a fairly lengthy thread that is more than 10 years old.

            Unfortunately, your dataex example doesn't help. You don't include the psu variable, so the reg command does not work as written. Further, year and bihar are constants in your extract:

            Code:
            . tab1 year bihar
            
            -> tabulation of year  
            
                        year |      Freq.     Percent        Cum.
            -----------------+-----------------------------------
            NFHS 5(Post Ban) |        100      100.00      100.00
            -----------------+-----------------------------------
                       Total |        100      100.00
            
            -> tabulation of bihar  
            
                              bihar |      Freq.     Percent        Cum.
            ------------------------+-----------------------------------
            States other than Bihar |        100      100.00      100.00
            ------------------------+-----------------------------------
                              Total |        100      100.00
            I find it puzzling that you are having problems. Perhaps drop the cluster option and see if that helps. Also make sure your version of Stata is up to date. Occasionally you can get lucky and find that what seems to be a bug has already been fixed.
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            WWW: https://www3.nd.edu/~rwilliam
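
            A sketch of those two checks, reusing the commands from #19. The cross-tab at the end is an extra diagnostic in the spirit of #17 and is not something suggested in this post:

            Code:
            * Refit without the cluster option and retry margins.
            reg drinkalcohol i.year##i.bihar age agesq i.sex i.currentmarital i.hh_religion i.wealthindex i.residenceplace hhsize i.diistrict
            margins year#bihar

            * Check whether the installed Stata is up to date.
            update query

            * Extra diagnostic (an assumption, not from the thread):
            * look for empty bihar-by-district cells in the estimation sample.
            tabulate diistrict bihar if e(sample)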
