  • Rose Simmons
    started a topic Testing whether to include a squared term

    Testing whether to include a squared term

    Hi,

    I am using a panel dataset.
    vote is my dependent variable: 1 if the respondent voted in an annual leadership election, and 0 otherwise (so I am using nonlinear methods).
    My independent variables include marital status, gender, age etc.

    I then run my regression with only age and age^2 as control variables:

    Code:
    xtprobit vote c.age c.age#c.age, re vce(robust)
    I then conduct the test to see whether age^2 should be included, because I suspect there may be a U-shaped or inverse U-shaped relationship with voting (e.g. very young and very old people may be more or less likely to vote than middle-aged people, in a non-linear relationship).

    Code:
    test age c.age#c.age
    
     ( 1)  [vote]age= 0
     ( 2)  [vote]c.age#c.age= 0
    
               chi2(  2) =    4.34
             Prob > chi2 =    0.1141
    Does this result suggest that age^2 is insignificant, and that perhaps I should include only age?

    I believe this is the appropriate test of the significance of the squared term, but please advise me if I'm mistaken.

    Thank you
    Last edited by Rose Simmons; 05 Mar 2017, 15:15.
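
    A note on what the test above checks: `test age c.age#c.age` is a joint Wald test that the linear and the squared coefficients are both zero, so it does not isolate the squared term. A minimal sketch of testing the squared term on its own, and of inspecting the shape of the fitted relationship, could look like this. The age grid 20(10)80 is an assumption about the range of age in the data, not something stated in the post.

    ```stata
    * Refit the model from the post
    xtprobit vote c.age c.age#c.age, re vce(robust)

    * Joint test: H0 is that the age and age^2 coefficients are both zero
    test age c.age#c.age

    * Test of the squared term alone: H0 is that the age^2 coefficient is zero
    test c.age#c.age

    * Plot predictions over age to look for a U or inverse-U shape
    * (20(10)80 is an assumed age range; adjust it to the data)
    margins, at(age=(20(10)80))
    marginsplot
    ```

    With a single restriction, the Wald test is equivalent to the z test reported for c.age#c.age in the xtprobit output, so the second test mainly makes the hypothesis explicit.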

  • Latoya Sundack
    replied
    Dear Andrew,

    Thank you very much for your reply.

    Regards!


  • Andrew Musau
    replied
    Hard to say without a data example. However, if you simply want a 1/0/missing variable:

    Code:
    gen Vacc = cond(Immu==1, 1, 0) if !missing(Immu)| !missing(Wazs)
    If this does not work, provide a data example using dataex.
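
    One possible way to keep the dummy missing wherever Immu itself is missing, sketched under the assumption that Immu equals 1 for vaccinated children and some other nonmissing value (for example 0) for unvaccinated children:

    ```stata
    * Vacc is 1 if Immu == 1, 0 for any other nonmissing Immu, and missing otherwise
    generate byte Vacc = (Immu == 1) if !missing(Immu)

    * Check the counts, including missing values
    tabulate Vacc, missing
    ```

    The table should then show zeros and ones only for observations where Immu is nonmissing, regardless of whether Wazs is observed.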


  • Latoya Sundack
    replied
    Dear Andrew,

    Thanks so much. It worked.

    I have one more question. I have 5600 observations on whether children were vaccinated (5000 yes, 600 no). However, when I generate the vaccination dummy, missing values across the entire dataset are coded as 0. I have 7000 Wazs observations in total, so the dummy shows 2000 no (i.e., the 600 plus the 1400 missing observations). Is there a way to generate the dummy so that it only takes the 5600 observations with vaccination information into account?

    gen Vacc =1 if Immu==1
    replace Vacc=0 if mi(Vacc)
    Thanks for your help. I realised this is not a message for this topic. However, I am grateful for your help with this.

    Kind Regards!


  • Andrew Musau
    replied
    reg Wazs i.year i.month i.wave, robust cluster (hv001)
    You have to choose between month, year and wave dummies. You cannot have more than one of the three, as these are collinear. In short, by including month effects, you have accounted for year effects and wave effects, as months are nested in years, which in turn are nested in waves.
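
    A quick way to see the nesting and to fit a model with only one set of time dummies, as a sketch. It assumes the variables wave, year and hv001 exist as in the posts in this thread; which set of dummies to keep is a modelling choice that the thread does not settle.

    ```stata
    * Each year should fall in exactly one wave if years are nested in waves
    tabulate year wave

    * Option A: survey-round (wave) fixed effects only
    regress Wazs i.wave, vce(cluster hv001)

    * Option B: year fixed effects only, which already absorb the wave effects
    regress Wazs i.year, vce(cluster hv001)
    ```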


  • Latoya Sundack
    replied
    Dear Andrew,

    Thanks for the code. However, I got the same result as with the code I used before, i.e., the wave dummies are omitted due to collinearity. Do you have any other suggestions?

    Thanks.

    Regards!


  • Andrew Musau
    replied
    Code:
    gen wave= cond(year <=2000, 1, cond(inrange(year, 2001, 2006), 2, 3))
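
    One caveat worth checking, as a hedged sketch: Stata treats numeric missing values as larger than any number, so an observation with a missing year fails both `year <= 2000` and `inrange(year, 2001, 2006)` and would be assigned wave 3 by the command above. If year can be missing in this dataset (an assumption, since no data example was posted), adding a qualifier keeps wave missing for those observations:

    ```stata
    * Same three-way split, but wave stays missing when year is missing
    generate byte wave = cond(year <= 2000, 1, cond(inrange(year, 2001, 2006), 2, 3)) if !missing(year)

    * Verify how years map onto waves
    tabulate year wave, missing
    ```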


  • Latoya Sundack
    replied
    Dear Andrew,

    Thanks. I have created these in the given round. However, I see what you mean about all observations being equal. Do you have any suggestions on how this can be done?
    Thank you.

    Regards!


  • Andrew Musau
    replied
    gen wave=1 if year <=2000
    gen wave=2 if year <=2006
    gen wave=2 if year <=2014
    Not sure what you are doing here. The second command would overwrite the first and the third would overwrite the second, but Stata will not allow you to create two or more variables with the same name in the first place, so this cannot be the actual code that you ran. If the last year in the sample is 2014, your last command creates a variable that takes the same value for all observations. This will be collinear with the constant term in the regression.
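
    For comparison, a sketch of the generate/replace pattern that the original attempt seems to have been aiming for. The 2000 and 2006 cutoffs are taken from the post; treating 2007 to 2014 as the third wave is an assumption.

    ```stata
    * Create the variable once, then fill in the other waves with replace
    generate byte wave = 1 if year <= 2000
    replace wave = 2 if inrange(year, 2001, 2006)
    replace wave = 3 if inrange(year, 2007, 2014)   // assumed third-wave range

    * Confirm that wave actually varies across observations
    tabulate wave, missing
    ```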


  • Latoya Sundack
    replied
    Dear Statalist,

    I am working with three rounds (waves) of UNICEF MICS data. It is a cross-sectional dataset and I would like to use survey-round fixed effects. I have generated a wave variable to identify each of the three waves/rounds.

    gen wave=1 if year <=2000
    gen wave=2 if year <=2006
    gen wave=2 if year <=2014
    Basically, this code creates a count of the total for each round/wave.

    However, when I run the regression
    reg Wazs i.year i.month i.wave, robust cluster (hv001)
    the wave fixed effects are all omitted due to collinearity.

    Thank you for your reply,

    Kind Regards!


  • sophie maene
    replied
    Thank you very much!


  • Carlo Lazzaro
    replied
    Sophie:
    I agree with you: no squared term for age is necessary.


  • sophie maene
    replied
    I actually have a lot more variables in my model, but I thought the post would become too long, so I only showed age and age squared. This is my full model:

    Code:
    xtoprobit shealth c.age##c.age##i.LTCsystem female bmi i.co007_ eduyears_mod chronic_mod i.childhoodhealth eurod ever_smoked smoking i.br010_mod i.sportsoractivities ch001_ ch021_mod partnerinhh gdp i.wave, vce(cluster mergeid_n)
    Code:
     xtoprobit shealth c.age##c.age##i.LTCsystem female bmi i.co007_ eduyears_mod chronic_mod i.childhoodhealth eurod ever_smoked smoking i
    > .br010_mod i.sportsoractivities ch001_ ch021_mod partnerinhh gdp i.wave, vce(cluster mergeid_n)
    
    Fitting comparison model:
    
    Iteration 0:   log likelihood = -68026.187  
    Iteration 1:   log likelihood =  -57366.91  
    Iteration 2:   log likelihood = -57291.882  
    Iteration 3:   log likelihood = -57291.846  
    Iteration 4:   log likelihood = -57291.846  
    
    Refining starting values:
    
    Grid node 0:   log likelihood = -58448.777
    
    Fitting full model:
    
    Iteration 0:   log pseudolikelihood = -58448.777  (not concave)
    Iteration 1:   log pseudolikelihood = -56606.722  
    Iteration 2:   log pseudolikelihood =   -55660.1  
    Iteration 3:   log pseudolikelihood = -55633.542  
    Iteration 4:   log pseudolikelihood = -55633.321  
    Iteration 5:   log pseudolikelihood = -55633.321  
    
    Random-effects ordered probit regression        Number of obs     =     47,521
    Group variable: mergeid_n                       Number of groups  =     26,564
    
    Random effects u_i ~ Gaussian                   Obs per group:
                                                                  min =          1
                                                                  avg =        1.8
                                                                  max =          4
    
    Integration method: mvaghermite                 Integration pts.  =         12
    
                                                    Wald chi2(42)     =   11773.70
    Log pseudolikelihood  = -55633.321              Prob > chi2       =     0.0000
    
                                                (Std. Err. adjusted for 26,564 clusters in mergeid_n)
    -------------------------------------------------------------------------------------------------
                                    |               Robust
                            shealth |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------------------------+----------------------------------------------------------------
                                age |  -.0300641   .0482202    -0.62   0.533     -.124574    .0644458
                                    |
                        c.age#c.age |   .0000321   .0003173     0.10   0.919    -.0005899    .0006541
                                    |
                          LTCsystem |
                         Cluster 2  |   2.331213    2.42739     0.96   0.337    -2.426384    7.088811
                         Cluster 3  |   1.362363   2.425582     0.56   0.574     -3.39169    6.116417
                         Cluster 4  |   2.824523   2.755899     1.02   0.305    -2.576939    8.225986
                                    |
                    LTCsystem#c.age |
                         Cluster 2  |  -.0797573    .064341    -1.24   0.215    -.2058634    .0463488
                         Cluster 3  |  -.0385814   .0640747    -0.60   0.547    -.1641655    .0870027
                         Cluster 4  |  -.0881586   .0731019    -1.21   0.228    -.2314357    .0551185
                                    |
              LTCsystem#c.age#c.age |
                         Cluster 2  |   .0005722   .0004242     1.35   0.177    -.0002592    .0014035
                         Cluster 3  |   .0002343   .0004209     0.56   0.578    -.0005906    .0010593
                         Cluster 4  |   .0006058   .0004824     1.26   0.209    -.0003397    .0015513
                                    |
                             female |   .1099243   .0180642     6.09   0.000      .074519    .1453295
                                bmi |   -.028748   .0018042   -15.93   0.000    -.0322842   -.0252117
                                    |
                             co007_ |
           2. With some difficulty  |   .1223905   .0272272     4.50   0.000     .0690263    .1757548
                  3. Fairly easily  |   .2568748   .0284901     9.02   0.000     .2010352    .3127144
                         4. Easily  |   .3786829   .0299674    12.64   0.000     .3199479    .4374179
                                    |
                       eduyears_mod |   .0350989   .0020617    17.02   0.000      .031058    .0391399
                        chronic_mod |  -.5988879   .0161005   -37.20   0.000    -.6304442   -.5673315
                                    |
                    childhoodhealth |
                            2.Poor  |  -.4891291   .1292088    -3.79   0.000    -.7423737   -.2358845
                            3.Fair  |  -.3603825   .1221198    -2.95   0.003    -.5997329    -.121032
                            4.Good  |  -.2329024   .1201075    -1.94   0.052    -.4683088     .002504
                       5.Very good  |  -.0536764   .1199848    -0.45   0.655    -.2888422    .1814895
                       6.Excellent  |   .1493303   .1202162     1.24   0.214    -.0862892    .3849499
                                    |
                              eurod |  -.2254837   .0036085   -62.49   0.000    -.2325562   -.2184111
                        ever_smoked |  -.0774778    .018341    -4.22   0.000    -.1134255   -.0415302
                            smoking |  -.0065085   .0257136    -0.25   0.800    -.0569063    .0438893
                                    |
                          br010_mod |
         2. less than once a month  |   .1802149   .0240414     7.50   0.000     .1330947    .2273352
          3. once or twice a month  |   .2578486   .0239887    10.75   0.000     .2108316    .3048655
           4. once or twice a week  |   .3625201   .0219147    16.54   0.000     .3195681    .4054721
      5. three or four days a week  |   .4214576    .029104    14.48   0.000     .3644148    .4785005
        6. five or six days a week  |   .4273772   .0399496    10.70   0.000     .3490775    .5056769
               7. almost every day  |   .3480736   .0209166    16.64   0.000     .3070778    .3890693
                                    |
                 sportsoractivities |
     2. One to three times a month  |   .4058166   .0225476    18.00   0.000     .3616241    .4500091
                    3. Once a week  |    .451309   .0194354    23.22   0.000     .4132163    .4894018
           4.More than once a week  |   .5949294   .0168049    35.40   0.000     .5619923    .6278664
                                    |
                             ch001_ |   .0028423   .0079224     0.36   0.720    -.0126853    .0183699
                          ch021_mod |   .0025406   .0031504     0.81   0.420     -.003634    .0087152
                        partnerinhh |  -.1024684   .0179812    -5.70   0.000    -.1377109    -.067226
                                gdp |   .0000116   1.24e-06     9.37   0.000     9.20e-06    .0000141
                                    |
                               wave |
                                 2  |  -.2832655   .0204394   -13.86   0.000     -.323326    -.243205
                                 4  |  -.3138069   .0218063   -14.39   0.000    -.3565466   -.2710673
                                 5  |  -.3327459   .0217811   -15.28   0.000    -.3754361   -.2900557
    --------------------------------+----------------------------------------------------------------
                              /cut1 |  -5.117054   1.827863                       -8.6996   -1.534509
                              /cut2 |  -3.394676   1.828006                     -6.977502    .1881492
                              /cut3 |  -1.677042   1.827917                     -5.259693    1.905609
                              /cut4 |  -.5206001   1.827813                     -4.103048    3.061848
    --------------------------------+----------------------------------------------------------------
                          /sigma2_u |   .7327147   .0234679                      .6881324    .7801854
    -------------------------------------------------------------------------------------------------
    Which gives the following marginsplot:

    Code:
    margins, at(age=(65(5)95)) nose
    marginsplot
    [Figure: Graph.png (marginsplot output)]

    So I guess no age squared term is necessary!
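
    The plot is suggestive, but a formal check is also possible. Because the full model interacts age^2 with LTCsystem, the squared term enters through several coefficients. A joint Wald test of all of them could be sketched as follows, run directly after the xtoprobit command above:

    ```stata
    * H0: the age^2 coefficient and all LTCsystem-by-age^2 interactions are zero
    testparm c.age#c.age i.LTCsystem#c.age#c.age
    ```

    A large p-value here would support dropping the squared terms as a group, which is a stronger statement than reading the individual z tests one at a time.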


  • Carlo Lazzaro
    replied
    Sophie:
    Your results do not show evidence of a squared relationship for -age-.
    That said, only the time dimension of your panel seems to play a role in explaining variations in the regressand.
    What also strikes me is that -age- is your only other predictor.
    Are you sure that you gave a fair and true view of the data generating process?


  • sophie maene
    replied
    Hi, yes sorry here it is!

    Code:
     xtoprobit shealth c.age##c.age i.wave, vce(cluster mergeid_n)
    
    Fitting comparison model:
    
    Iteration 0:   log likelihood = -220787.09  
    Iteration 1:   log likelihood = -216727.71  
    Iteration 2:   log likelihood = -216726.89  
    Iteration 3:   log likelihood = -216726.89  
    
    Refining starting values:
    
    Grid node 0:   log likelihood = -209010.45
    
    Fitting full model:
    
    Iteration 0:   log pseudolikelihood = -209010.45  
    Iteration 1:   log pseudolikelihood = -200663.47  
    Iteration 2:   log pseudolikelihood =  -196345.3  
    Iteration 3:   log pseudolikelihood = -195566.04  
    Iteration 4:   log pseudolikelihood = -195532.62  
    Iteration 5:   log pseudolikelihood = -195532.47  
    Iteration 6:   log pseudolikelihood = -195532.47  
    
    Random-effects ordered probit regression        Number of obs     =    154,711
    Group variable: mergeid_n                       Number of groups  =     62,399
    
    Random effects u_i ~ Gaussian                   Obs per group:
                                                                  min =          1
                                                                  avg =        2.5
                                                                  max =          7
    
    Integration method: mvaghermite                 Integration pts.  =         12
    
                                                    Wald chi2(8)      =    6777.25
    Log pseudolikelihood  = -195532.47              Prob > chi2       =     0.0000
    
                             (Std. Err. adjusted for 62,399 clusters in mergeid_n)
    ------------------------------------------------------------------------------
                 |               Robust
         shealth |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0634849   .0137425    -4.62   0.000    -.0904197   -.0365502
                 |
     c.age#c.age |   .0000222   .0000901     0.25   0.805    -.0001543    .0001987
                 |
            wave |
              2  |  -.1946302    .014936   -13.03   0.000    -.2239043   -.1653562
              3  |  -.3675597   .0161868   -22.71   0.000    -.3992852   -.3358342
              4  |   -.243044   .0154814   -15.70   0.000     -.273387    -.212701
              5  |  -.1574564   .0153565   -10.25   0.000    -.1875545   -.1273583
              6  |  -.1856671   .0154187   -12.04   0.000    -.2158873    -.155447
              7  |  -.2027917   .0157322   -12.89   0.000    -.2336263   -.1719571
    -------------+----------------------------------------------------------------
           /cut1 |  -6.501648   .5206987                     -7.522199   -5.481098
           /cut2 |  -4.913978   .5207371                     -5.934604   -3.893352
           /cut3 |  -3.246265   .5207259                     -4.266869   -2.225661
           /cut4 |  -2.076403   .5207411                     -3.097037    -1.05577
    -------------+----------------------------------------------------------------
       /sigma2_u |   1.575051   .0200396                       1.53626    1.614822
    ------------------------------------------------------------------------------
