  • Why won't my test for fixed vs random effects models work? The Mundlak approach.

    I have panel data on women measured across 3 waves, and I am unsure whether to model them with fixed or random effects. Because I cluster my regression on the respondent's current county, I cannot use a Hausman test, so I am attempting the Mundlak approach as described on the Stata blog (https://blog.stata.com/2015/10/29/fi...dlak-approach/).

    On the blog, computing the test is described as follows:
    1. Compute the panel-level average of your time-varying covariates.
    2. Use a random-effects estimator to regress your covariates and the panel-level means generated in (1) against your outcome.
    3. Test that the panel-level means generated in (1) are jointly zero.
    But when I do this, I find the following results:

    Code:
    capture drop mean_psum_y
    bysort id: egen mean_psum_y = mean(psum_unemployed_total_cont_y)
    (108 missing values generated)
    
    capture drop mean_age_y
    bysort id: egen mean_age_y = mean(age_y)
    (183 missing values generated)
    
    capture drop mean_year
    bysort id: egen mean_year = mean(year)
    
    capture drop mean_current_county_y1
    bysort id: egen mean_current_county_y1 = mean(current_county_y1)
    
    capture drop mean_own_education_y
    bysort id: egen mean_own_education_y = mean(own_education_y)
    (90 missing values generated)

    Code:
    . xtreg binary_health_y psum_unemployed_total_cont_y calt3_other_children_y0 i.year i.current_county_y1 i.own_education_y
    >  age_y mean_current_county_y1 mean_own_education_y mean_year mean_age_y mean_psum_y if has_y0_questionnaire==1 & has_y5
    > _questionnaire==1 | has_y0_questionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire==1 & has_y5_questionnaire
    > ==1 & has_y10_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==0 | has_y0_questionnaire
    > ==1 & cbmi_y10 !=. & has_y10_questionnaire==0 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==0 & cbmi_
    > y10 !=. & has_y10_questionnaire==0 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 | has_y0_questionn
    > aire==1 & cbmi_y10 !=. & has_y10_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 & c
    > bmi_y10 !=. & has_y10_questionnaire==1, cluster (current_county_y1) re
    note: mean_current_county_y1 omitted because of collinearity
    note: mean_own_education_y omitted because of collinearity
    note: mean_year omitted because of collinearity
    note: mean_age_y omitted because of collinearity
    note: mean_psum_y omitted because of collinearity
    
    Random-effects GLS regression                   Number of obs     =      1,578
    Group variable: id                              Number of groups  =        635
    
    R-sq:                                           Obs per group:
         within  = 0.0063                                         min =          1
         between = 0.0892                                         avg =        2.5
         overall = 0.0585                                         max =          3
    
                                                    Wald chi2(9)      =          .
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =          .
    
                                                                   (Std. Err. adjusted for 29 clusters in current_county_y1)
    ------------------------------------------------------------------------------------------------------------------------
                                                           |               Robust
                                           binary_health_y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------------------------------------------------+----------------------------------------------------------------
                              psum_unemployed_total_cont_y |   -.011246   .0042776    -2.63   0.009      -.01963    -.002862
                                   calt3_other_children_y0 |  -.0132467   .0144022    -0.92   0.358    -.0414744     .014981
                                                           |
                                                      year |
                                                        5  |  -.0797771    .026846    -2.97   0.003    -.1323944   -.0271598
                                                       10  |   .0380614   .0343703     1.11   0.268     -.029303    .1054259
                                                           |
                                         current_county_y1 |
                                                    Cavan  |   .3497291   .0472267     7.41   0.000     .2571665    .4422917
                                                    Clare  |  -.0653339   .0108904    -6.00   0.000    -.0866788   -.0439891
                                                     Cork  |  -.1672661   .0185473    -9.02   0.000    -.2036181   -.1309142
                                                  Donegal  |   .2440473   .0409964     5.95   0.000     .1636959    .3243987
                                              Dublin City  |  -.0319963   .0105698    -3.03   0.002    -.0527128   -.0112799
                                   Dún Laoghaire-Rathdown  |  -.0261354    .023202    -1.13   0.260    -.0716106    .0193397
                                                   Fingal  |   .0898723   .0135182     6.65   0.000     .0633771    .1163674
                                                   Galway  |  -.0208434   .0111599    -1.87   0.062    -.0427165    .0010297
                                              Galway City  |   .0171861   .0085169     2.02   0.044     .0004932     .033879
                                                    Kerry  |   .1631191   .0171811     9.49   0.000     .1294448    .1967933
                                                  Kildare  |   -.052603   .0137643    -3.82   0.000    -.0795805   -.0256256
                                                 Kilkenny  |   -.054465   .0247041    -2.20   0.027    -.1028841   -.0060459
                                                    Laois  |  -.1723436   .0261493    -6.59   0.000    -.2235953   -.1210919
                                                 Limerick  |   .1516061   .0256513     5.91   0.000     .1013305    .2018817
                                                 Longford  |   .3243542   .0275731    11.76   0.000      .270312    .3783964
                                                    Louth  |   .2935245   .0208773    14.06   0.000     .2526058    .3344432
                                                     Mayo  |  -.0083414   .0164228    -0.51   0.612    -.0405295    .0238468
                                                    Meath  |  -.0115489   .0164661    -0.70   0.483    -.0438219    .0207242
                                                 Monaghan  |   -.393205   .0254279   -15.46   0.000    -.4430427   -.3433672
                                                   Offaly  |   -.117562   .0079983   -14.70   0.000    -.1332383   -.1018857
                                                Roscommon  |   .1387658   .0161138     8.61   0.000     .1071833    .1703483
                                                    Sligo  |  -.8495427    .023403   -36.30   0.000    -.8954118   -.8036736
                                             South Dublin  |  -.1471678   .0090874   -16.19   0.000    -.1649788   -.1293568
                                          Tipperary North  |   .1399979   .0234896     5.96   0.000     .0939591    .1860366
                                                Waterford  |  -.0394648   .0214446    -1.84   0.066    -.0814955    .0025659
                                                Westmeath  |  -.0384343   .0119437    -3.22   0.001    -.0618435   -.0150252
                                                  Wexford  |   .0530227   .0179453     2.95   0.003     .0178506    .0881948
                                                  Wicklow  |  -.0002635   .0148818    -0.02   0.986    -.0294314    .0289043
                                                           |
                                           own_education_y |
                                             No schooling  |          0  (empty)
                                 Primary school education  |   .4571419   .2121907     2.15   0.031     .0412558    .8730279
                                    Some secondary school  |   .6485139   .0774671     8.37   0.000     .4966812    .8003467
                             Complete secondary education  |   .6711843   .1152782     5.82   0.000     .4452432    .8971255
    Some third level education at college, university, ..  |   .7155908    .125676     5.69   0.000     .4692704    .9619112
    Complete third level education at college, universi..  |   .8162614   .1183988     6.89   0.000     .5842041    1.048319
                                                           |
                                                     age_y |   .0049239   .0039453     1.25   0.212    -.0028088    .0126565
                                    mean_current_county_y1 |          0  (omitted)
                                      mean_own_education_y |          0  (omitted)
                                                 mean_year |          0  (omitted)
                                                mean_age_y |          0  (omitted)
                                               mean_psum_y |          0  (omitted)
                                                     _cons |          0  (omitted)
    -------------------------------------------------------+----------------------------------------------------------------
                                                   sigma_u |  .25259967
                                                   sigma_e |  .35438184
                                                       rho |  .33690033   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------------------------------------------------
    
    . 
    . capture drop mundlak
    
    . estimates store mundlak

    Below I test the joint significance of the added means as instructed on the Stata blog, and this is where I have my problem:

    Code:
    
    . test mean_psum_y  mean_age_y mean_year mean_current_county_y1 mean_own_education_y
    
     ( 1)  o.mean_psum_y = 0
     ( 2)  o.mean_age_y = 0
     ( 3)  o.mean_year = 0
     ( 4)  o.mean_current_county_y1 = 0
     ( 5)  o.mean_own_education_y = 0
           Constraint 1 dropped
           Constraint 2 dropped
           Constraint 3 dropped
           Constraint 4 dropped
           Constraint 5 dropped
    
               chi2(  0) =       .
             Prob > chi2 =         .

    As you can see, there is no output at all for this test.

    Previously I had included more controls (which I now exclude due to endogeneity fears) and had gotten output from this test. Can anyone advise me why I am getting no results now and how to remedy this?

    Best,

    John

  • #2
    A short but unsatisfying answer is: all your mean_* variables are omitted from the regression model due to collinearity; you cannot test coefficients that are not estimated. My guess is that the mean_* variables are constant within id's, e.g., individuals do not change educational level over time, but you need to figure out what exactly is going on.

    One more comment: the if condition for your regression model runs over several lines, but you compute the mean values by id for the complete sample (although different variables seem to have different numbers of missing values). The means should instead be computed on the exact same sample that is used in the regression model.

    Best
    Daniel
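
    The two points in this reply (omitted terms cannot be tested, and the means should come from the estimation sample) can be tried out on a small public dataset. The following is only a sketch: it uses Stata's nlswork example data rather than the poster's data, and the flag variable touse and the mean_* names are assumptions chosen for illustration.

    ```stata
    * A sketch on Stata's nlswork example data, not the poster's data.
    webuse nlswork, clear
    xtset idcode year

    * Flag the estimation sample first (any if-condition would be added here).
    generate byte touse = !missing(ln_wage, tenure, hours)

    * Step 1: panel-level means, computed only over the flagged observations.
    bysort idcode: egen double mean_tenure = mean(tenure) if touse
    bysort idcode: egen double mean_hours  = mean(hours)  if touse

    * Step 2: random-effects regression that includes the panel-level means.
    xtreg ln_wage tenure hours mean_tenure mean_hours if touse, re vce(cluster idcode)

    * A nonzero count here would signal that the means and the model
    * were computed on different samples.
    count if touse != e(sample)

    * Step 3: joint test that the coefficients on the means are zero.
    test mean_tenure mean_hours
    ```

    If any of the mean_* terms were reported as omitted in step 2, the test in step 3 would drop the corresponding constraint, which is the symptom shown in the original post.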


    • #3
      Thank you for your response Daniel.

      I had a feeling that it was something like this, but as age, year, etc., automatically increase across waves, I just can't seem to reconcile this with my results. Thank you for the advice on my means, which I have updated accordingly.

      Best,

      Jonathan


      • #4
        John, at this stage what you are presenting is overwhelming.

        Can you not generate a toy example from your data, or try the Mundlak approach on a toy dataset and see whether it works?

        Your verbal description of the Mundlak approach above is correct; what you are saying is what you need to do to implement it.


        • #5
          Thank you Joro for your response. I have tried this approach with this data and different variables for the panel-level means, and it worked. I can only assume that there is something wrong with the current means, listed below:
          Code:
           
           mean_psum_y  mean_age_y mean_year mean_current_county_y1 mean_own_education_y
          That these means do not vary across individuals or waves is the most logical conclusion. However, actually browsing these variables as below confirms that they do vary across waves:

          Code:
              id    year    mean_psum_y    mean_age_y    mean_year    mean_current_county_y1    mean_own_education_y
          200051       0       7.781173      30.06398            0                  11.13612                4.592351
          200051      10       16.86173      40.06398           10                  11.13612                4.592351
          200051       5        7.54015      35.06398            5                  11.13612                4.592351
          200071       0       7.781173      30.06398            0                  11.13612                4.592351
          200071       5        7.54015      35.06398            5                  11.13612                4.592351
          200071      10       16.86173      40.06398           10                  11.13612                4.592351

          So I really can't determine what's going wrong here at all.


          • #6
            If you have

            xtset id year

            your data, then yes, these means are wrong. They have to be constant within id, and they are not.

            You generate those means by

            egen meanx = mean(x), by(id)


            • #7
              Hi Joro,

              I do both of these in my analysis (xtset id year, and generating means by id) and still face this problem. Can you explain what you mean by constant within id, and why the snapshot of means I supplied is incorrect?

              Very best,

              John


              • #8
                I do not see any errors in the code you use to generate the panel-specific means.

                bysort id: egen meanx = mean(x)

                is valid syntax.

                But then, when you present the subsample of your data, your means are clearly not constant within id.

                E.g., for id = 200051

                your mean_psum_y

                is 7.781173 for year=0
                then 16.86173 for year=10
                then 7.54015 for year = 5.

                This is definitely not a constant mean_psum_y within id = 200051


                • #9
                  Hi Joro,

                  Thank you for confirming my syntax is valid. I'm not sure I understand the logic that means should be constant within id. Shouldn't an individual's mean age change depending on what wave they are in? That is, if they are 30 in wave 1, then 5 years later they would be 35, and 5 years after that they would be 40. Regardless, my syntax seems to be trying to create panel-specific means; I just don't see why it can't.

                  Best,

                  John


                  • #10
                    No - by definition a person's mean age is the mean of the ages for that person in the sample. Mean age should vary by person, but not within people. The wave-specific mean for a person is just that person's value in that wave (since there is only one observation per person per wave).
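
                    A quick way to check this property is to build the mean on a tiny dataset and assert that it does not change within a person. This is a minimal sketch: the two ids are borrowed from the listing earlier in the thread, but the ages are invented for illustration.

                    ```stata
                    * Toy panel: the ages below are made up for illustration only.
                    clear
                    input long id int year double age
                    200051  0 30
                    200051  5 35
                    200051 10 40
                    200071  0 24
                    200071  5 29
                    end

                    * Person-level mean: one value per id, repeated on every wave.
                    bysort id: egen double mean_age = mean(age)

                    * Within each id, sort by mean_age and compare the first and last
                    * values; assert stops with an error if any id has a varying mean.
                    bysort id (mean_age): assert mean_age[1] == mean_age[_N]

                    list, sepby(id)
                    ```

                    Running the same assert on the real mean_* variables would show whether they were built by id or by something else, such as wave.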


                    • #11
                      Hi Phil,

                      Thank you, that makes perfect sense, although I simply cannot understand why this is not the case in my own analysis: the syntax is correct, and the data was xtset id year. Does anything come to mind as a potential culprit?

                      Best,

                      Jonathan


                      • #12
                        Generating the means manually, I was able to create means that vary by person but not within people (see the listing after the test output below).

                        However, my Mundlak test is still reporting the following. Does anyone have any insight?

                        Code:
                        . test mean_psum_y  mean_age_y mean_year mean_current_county_y1 mean_own_education_y
                        
                         ( 1)  o.mean_psum_y = 0
                         ( 2)  o.mean_age_y = 0
                         ( 3)  o.mean_year = 0
                         ( 4)  o.mean_current_county_y1 = 0
                         ( 5)  o.mean_own_education_y = 0
                               Constraint 1 dropped
                               Constraint 2 dropped
                               Constraint 3 dropped
                               Constraint 4 dropped
                               Constraint 5 dropped
                        
                                   chi2(  0) =       .
                                 Prob > chi2 =         .
                        .


                        Code:
                        year        id    mean_psum_y    mean_age_y    mean_year    mean_current_county_y1    mean_own_education_y
                           0    200051       11.51667          34.9     .3333333                        10                       4
                           5    200051       11.51667          34.9     .3333333                        10                       4
                          10    200051       11.51667          34.9     .3333333                        10                       4
                           0    200071       13.07667          28.9     .3333333                        11                       5
                           5    200071       13.07667          28.9     .3333333                        11                       5
                          10    200071       13.07667          28.9     .3333333                        11                       5


                        • #13
                          Another piece to add to the puzzle! If I change my continuous variables to ordered variables, they are no longer dropped from the Mundlak test! Below I have changed continuous age to age brackets (0-20, 20-30, 30-40, etc.), and it is no longer dropped. This variable is continuous within individuals.


                          Code:
                          . test mean_psum_y  mean_ord_age_y mean_year mean_current_county_y1 mean_own_education_y
                          
                           ( 1)  o.mean_psum_y = 0
                           ( 2)  mean_ord_age_y = 0
                           ( 3)  o.mean_year = 0
                           ( 4)  o.mean_current_county_y1 = 0
                           ( 5)  o.mean_own_education_y = 0
                                 Constraint 1 dropped
                                 Constraint 3 dropped
                                 Constraint 4 dropped
                                 Constraint 5 dropped
                          
                                     chi2(  1) =    0.12
                                   Prob > chi2 =    0.7265
                          .
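
                          One possible reading of this result, offered as a sketch rather than a diagnosis: if age advances in lockstep with the survey year, the person-level mean of age is an exact linear combination of terms already in the model. The symbols a_i (a person's age at wave 0) and \bar{t}_i (the mean of year over that person's observed waves, i.e. mean_year) are introduced here only for illustration.

                          ```latex
                          \begin{aligned}
                          \text{age}_{it} &= a_i + t, \qquad t \in \{0, 5, 10\} \\
                          \overline{\text{age}}_i &= a_i + \bar{t}_i \\
                          \overline{\text{age}}_i &= \text{age}_{it} - t + \bar{t}_i,
                            \qquad t = 5 \cdot 1[t = 5] + 10 \cdot 1[t = 10]
                          \end{aligned}
                          ```

                          Under that assumption, mean_age_y is determined exactly by age_y, the i.year dummies, and mean_year, so at least one of these terms cannot be estimated. An age bracket is not a linear function of t, which would be consistent with mean_ord_age_y surviving. The same logic may apply to mean_current_county_y1 and mean_own_education_y if county and education do not change within a person, because each mean would then be a linear combination of the corresponding factor dummies.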
