  • Quadratic term omitted when using # operator

    Hello all,

    I am running the xtlogit command in Stata 14.2, and my main variable of interest enters with a quadratic term, which I included based on theory and on a utest (ssc install utest) confirming the presence of a U-shape.

    I am working with an unbalanced panel with 15,165 observations (see example data below). The panel variable is id_ocad and the time variable is semester.

    My concern is that when I run the command using the # operator to generate the quadratic, the coefficient on the quadratic term is reported as 0 and the standard error is omitted in the output table.

    Code:
     xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap c.months_election##c.months_election i.semester, fe vce(oim)
    Code:
    . xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap c.months_election##c.months_election i.semester, fe vce(oim)
    note: c.months_election#c.months_election omitted because of collinearity
    note: 12.semester omitted because of collinearity
    note: multiple positive outcomes within groups encountered.
    note: 139 groups (622 obs) dropped because of all positive or
          all negative outcomes.
    
    Iteration 0:   log likelihood = -2597.9842  
    Iteration 1:   log likelihood = -2448.5756  
    Iteration 2:   log likelihood = -2437.0854  
    Iteration 3:   log likelihood = -2437.0664  
    Iteration 4:   log likelihood = -2437.0664  
    
    Conditional fixed-effects logistic regression   Number of obs     =      6,652
    Group variable: id_ocad                         Number of groups  =        796
    
                                                    Obs per group:
                                                                  min =          2
                                                                  avg =        8.4
                                                                  max =         10
    
                                                    LR chi2(16)       =    1162.32
    Log likelihood  = -2437.0664                    Prob > chi2       =     0.0000
    
    -----------------------------------------------------------------------------------------------------
                           prob_project |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ------------------------------------+----------------------------------------------------------------
                      n_projects_cumlag |  -.2028613   .0168779   -12.02   0.000    -.2359414   -.1697811
                        ln_densidad_pob |   11.82582   58.98681     0.20   0.841    -103.7862    127.4378
                           ln_poblacion |  -7.168727   58.99419    -0.12   0.903    -122.7952    108.4578
                                        |
                    ln_indice_desempeno |
                                    L2. |   .3498155     .18455     1.90   0.058    -.0118958    .7115267
                                        |
                           ln_tasa_mort |
                                    L2. |  -.0167774   .0661643    -0.25   0.800     -.146457    .1129022
                                        |
                             ln_balance |
                                    L2. |  -3.038936   2.949626    -1.03   0.303    -8.820098    2.742225
                                        |
                   ln_regalias_efec_cap |   .0736869   .0083789     8.79   0.000     .0572645    .0901094
                        months_election |  -.3035517   .0288018   -10.54   0.000    -.3600022   -.2471011
                                        |
    c.months_election#c.months_election |          0  (omitted)
                                        |
                               semester |
                                     4  |  -.5233661   .1549722    -3.38   0.001     -.827106   -.2196262
                                     5  |   -3.47403    .295229   -11.77   0.000    -4.052668   -2.895392
                                     6  |  -4.443102   .4478829    -9.92   0.000    -5.320936   -3.565268
                                     7  |  -6.573392   .6058894   -10.85   0.000    -7.760914   -5.385871
                                     8  |  -7.220639   .7680139    -9.40   0.000    -8.725919    -5.71536
                                     9  |   3.111472    .491074     6.34   0.000     2.148984    4.073959
                                    10  |   2.046095   .3227467     6.34   0.000     1.413523    2.678667
                                    11  |   .6660429   .1767568     3.77   0.000     .3196059     1.01248
                                    12  |          0  (omitted)
    -----------------------------------------------------------------------------------------------------
    However, when I manually generate the quadratic term and include it in the (otherwise) identical regression, the coefficient is reported as statistically significant and non-zero, and a utest confirms the presence of a U-shape, as mentioned above.

    I imagine there is a reason for the different outputs, which may tell me something important about my data and the appropriateness of the model I am running.

    In addition, as I would like to use margins after estimation, I would need to use the # operator to generate the quadratic term if possible.

    Thank you in advance for any suggestions.

    Best regards,

    Theo

    Code:
    xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap months_election months_election_sq i.semester, fe vce(oim)
    utest months_election months_election_sq, prefix(prob_project)
    Code:
    . xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap months_election months_election_sq i.semester, fe vce(oim)
    note: 10.semester omitted because of collinearity
    note: 12.semester omitted because of collinearity
    note: multiple positive outcomes within groups encountered.
    note: 139 groups (622 obs) dropped because of all positive or
          all negative outcomes.
    
    Iteration 0:   log likelihood = -2597.9842  
    Iteration 1:   log likelihood = -2448.5756  
    Iteration 2:   log likelihood = -2437.0854  
    Iteration 3:   log likelihood = -2437.0664  
    Iteration 4:   log likelihood = -2437.0664  
    
    Conditional fixed-effects logistic regression   Number of obs     =      6,652
    Group variable: id_ocad                         Number of groups  =        796
    
                                                    Obs per group:
                                                                  min =          2
                                                                  avg =        8.4
                                                                  max =         10
    
                                                    LR chi2(16)       =    1162.32
    Log likelihood  = -2437.0664                    Prob > chi2       =     0.0000
    
    --------------------------------------------------------------------------------------
            prob_project |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------------+----------------------------------------------------------------
       n_projects_cumlag |  -.2028613   .0168779   -12.02   0.000    -.2359414   -.1697811
         ln_densidad_pob |   11.82582   58.98681     0.20   0.841    -103.7862    127.4378
            ln_poblacion |  -7.168727   58.99419    -0.12   0.903    -122.7952    108.4578
                         |
     ln_indice_desempeno |
                     L2. |   .3498155     .18455     1.90   0.058    -.0118958    .7115267
                         |
            ln_tasa_mort |
                     L2. |  -.0167774   .0661643    -0.25   0.800     -.146457    .1129022
                         |
              ln_balance |
                     L2. |  -3.038936   2.949626    -1.03   0.303    -8.820098    2.742225
                         |
    ln_regalias_efec_cap |   .0736869   .0083789     8.79   0.000     .0572645    .0901094
         months_election |  -1.838123   .2689011    -6.84   0.000    -2.365159   -1.311087
      months_election_sq |    .028418   .0044826     6.34   0.000     .0196323    .0372037
                         |
                semester |
                      4  |  -.5233661   .1549722    -3.38   0.001     -.827106   -.2196262
                      5  |  -5.520125   .5898451    -9.36   0.000      -6.6762    -4.36405
                      6  |  -10.58139   1.377183    -7.68   0.000    -13.28062   -7.882158
                      7  |  -18.84996   2.491531    -7.57   0.000    -23.73327   -13.96665
                      8  |  -27.68159   3.932863    -7.04   0.000    -35.38986   -19.97332
                      9  |  -3.026814   .5554395    -5.45   0.000    -4.115455   -1.938172
                     10  |          0  (omitted)
                     11  |   .6660429   .1767568     3.77   0.000     .3196059     1.01248
                     12  |          0  (omitted)
    --------------------------------------------------------------------------------------
    
    . utest months_election months_election_sq, prefix (prob_project)
    (983 missing values generated)
    (1,996 missing values generated)
    
    Specification: f(x)=x^2
    Extreme point:  32.34084
    
    Test:
         H1: U shape
     vs. H0: Monotone or Inverse U shape
    
    -------------------------------------------------
                     |   Lower bound      Upper bound
    -----------------+-------------------------------
    Interval         |           0               42
    Slope            |   -1.838123          .548988
    t-value          |   -6.835684         5.063425
    P>|t|            |    4.44e-12         2.11e-07
    -------------------------------------------------
    
    Overall test of presence of a U shape:
         t-value =      5.06
         P>|t|   =  2.11e-07
    Code:
    input float(prob_project n_projects_cumlag ln_densidad_pob ln_poblacion ln_indice_desempeno ln_tasa_mort ln_balance ln_regalias_efec_cap months_election months_election_sq semester) long id_ocad
    0   0  3.945458  9.845434   4.21763  2.961141   13.6579   11.47631 42 1764  1     0
    0   0  3.945458  9.845434   4.21763  2.961141   13.6579   11.47631 36 1296  2     0
    0   0 3.9661324  9.865941 4.2298265  2.947067 13.654828  10.629907 30  900  3     0
    0   0 3.9661324  9.865941 4.2298265  2.947067 13.654828  10.629907 24  576  4     0
    1   0 3.9862025  9.886138 4.3641763  3.884652 13.655166   12.31328 18  324  5     0
    0   1 3.9862025  9.886138 4.3641763  3.884652 13.655166   12.31328 12  144  6     0
    0   1 4.0066056  9.906583 4.0745883  1.541159 13.651732  12.624626  6   36  7     0
    0   1 4.0066056  9.906583 4.0745883  1.541159 13.651732  12.624626  0    0  8     0
    0   1  4.027492  9.927351 4.1196294  3.016025 13.637353    11.8977 42 1764  9     0
    0   1  4.027492  9.927351 4.1196294  3.016025 13.637353    11.8977 36 1296 10     0
    0   1  4.047253  9.947169  4.064282         . 13.651488   11.55211 30  900 11     0
    0   1  4.047253  9.947169  4.064282         . 13.651488   11.55211 24  576 12     0
    0   1  4.067316   9.96726         .         .         .          . 18  324 13     0
    0   1  4.067316   9.96726         .         .         .          . 12  144 14     0
    0   0  3.902377 12.139313  3.984617  2.933325  13.65175  11.759857 42 1764  1 60092
    1   0  3.902377 12.139313  3.984617  2.933325  13.65175  11.759857 36 1296  2 60092
    1  17  3.924149  12.16117   4.34484  3.034472 13.619888  11.960607 30  900  3 60092
    1  20  3.924149  12.16117   4.34484  3.034472 13.619888  11.960607 24  576  4 60092
    1  42  3.946038   12.1829  3.397157 2.9343886  13.68308  11.899978 18  324  5 60092
    1  57  3.946038   12.1829  3.397157 2.9343886  13.68308  11.899978 12  144  6 60092
    0  89  3.967458 12.204366 4.2517734  2.933325 13.644894   7.416076  6   36  7 60092
    1  89  3.967458 12.204366 4.2517734  2.933325 13.644894   7.416076  0    0  8 60092
    0  94  3.988799 12.225733 4.3862324 2.8673306 13.640287  11.284286 42 1764  9 60092
    1  94  3.988799 12.225733 4.3862324 2.8673306 13.640287  11.284286 36 1296 10 60092
    0 104 4.0098753  12.24682  4.229876         . 13.642162  11.316903 30  900 11 60092
    1 104 4.0098753  12.24682  4.229876         . 13.642162  11.316903 24  576 12 60092
    1 105 4.0306945   12.2676         .         .         .          . 18  324 13 60092
    1 109 4.0306945   12.2676         .         .         .          . 12  144 14 60092
    0   0  4.325456  9.145802  3.890944 2.3702438  13.65227  12.265366 42 1764  1 60093
    1   0  4.325456  9.145802  3.890944 2.3702438  13.65227  12.265366 36 1296  2 60093
    1   6 4.3317857  9.152076  3.598994 3.3991954 13.653942   13.31222 30  900  3 60093
    1   7 4.3317857  9.152076  3.598994 3.3991954 13.653942   13.31222 24  576  4 60093
    0  11  4.338989  9.159258  4.244644  2.519308 13.654224   12.26798 18  324  5 60093
    1  11  4.338989  9.159258  4.244644  2.519308 13.654224   12.26798 12  144  6 60093
    1  14  4.344195  9.164506  4.115339         . 13.645218  12.139977  6   36  7 60093
    1  17  4.344195  9.164506  4.115339         . 13.645218  12.139977  0    0  8 60093
    0  18  4.351052 9.1713915 4.0765953 3.8811514  13.65453  11.402854 42 1764  9 60093
    1  18  4.351052 9.1713915 4.0765953 3.8811514  13.65453  11.402854 36 1296 10 60093
    0  21 4.3574777  9.177817 4.1624994         . 13.653556   11.16387 30  900 11 60093
    1  21 4.3574777  9.177817 4.1624994         . 13.653556   11.16387 24  576 12 60093
    1  23  4.364372  9.184612         .         .         .          . 18  324 13 60093
    1  24  4.364372  9.184612         .         .         .          . 12  144 14 60093
    0   0  4.148517 10.408164  4.179895  1.978239 13.654896  10.157875 42 1764  1 60094
    1   0  4.148517 10.408164  4.179895  1.978239 13.654896  10.157875 36 1296  2 60094
    0   2   4.14091  10.40053  4.229979 2.0399208 13.653278   11.41471 30  900  3 60094
    1   2   4.14091  10.40053  4.229979 2.0399208 13.653278   11.41471 24  576  4 60094
    0   5  4.133405 10.392926   4.19092  3.098289 13.652154   10.42813 18  324  5 60094
    1   5  4.133405 10.392926   4.19092  3.098289 13.652154   10.42813 12  144  6 60094
    1   6   4.12552  10.38508  4.190453  3.221672 13.651053  10.259785  6   36  7 60094
    1   8   4.12552  10.38508  4.190453  3.221672 13.651053  10.259785  0    0  8 60094
    0  10 4.1174097 10.377016  4.222297 3.3697066 13.654318 -1.7917595 42 1764  9 60094
    0  10 4.1174097 10.377016  4.222297 3.3697066 13.654318 -1.7917595 36 1296 10 60094
    0  10 4.1097255 10.369295  4.272544         .  13.65218   10.17851 30  900 11 60094
    1  10 4.1097255 10.369295  4.272544         .  13.65218   10.17851 24  576 12 60094
    0  11 4.1014857  10.36107         .         .         .          . 18  324 13 60094
    1  11 4.1014857  10.36107         .         .         .          . 12  144 14 60094
    end
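
    Both logs above report the same log likelihood (-2437.0664) and the same LR chi2(16), which suggests the two specifications may describe the same fitted model with different terms omitted. A minimal sketch for putting the two fits side by side follows; it assumes the posted variables are in memory, is meant to be run from a do-file, and uses illustrative names (`controls`, `m_factor`, `m_manual`) that are not from the original post.

    ```stata
    * Sketch only: assumes the posted variables are in memory
    xtset id_ocad semester

    * Controls shared by both specifications (local name is illustrative)
    local controls n_projects_cumlag ln_densidad_pob ln_poblacion ///
        l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ln_regalias_efec_cap

    * Quadratic written with factor-variable notation
    xtlogit prob_project `controls' c.months_election##c.months_election i.semester, fe vce(oim)
    estimates store m_factor

    * Quadratic built by hand
    xtlogit prob_project `controls' months_election months_election_sq i.semester, fe vce(oim)
    estimates store m_manual

    * Compare log likelihood, sample size, and rank of the two fits
    estimates table m_factor m_manual, stats(ll N rank) keep(n_projects_cumlag ln_regalias_efec_cap)
    ```

    If the log likelihood and rank agree, the difference between the two tables comes from which collinear term Stata drops, not from a difference in fit.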

  • #2
    Theodore:
    when you use your first code (which makes Stata aware of both a linear and a squared term for -months_election- on the right-hand side of your regression equation), Stata warns you that:
    Code:
    note: c.months_election#c.months_election omitted because of collinearity
    In your second code you create the squared term by hand, hence Stata does not know that -months_election_sq- is actually the square of -months_election-.
    Kind regards,
    Carlo
    (Stata 19.0)
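
    The note does not say what the squared term is collinear with. In the data excerpt, -months_election- follows the same sequence over semesters for every listed id_ocad, and the second run omits 10.semester instead of the squared term, so one possibility is that the semester dummies already span both terms. The sketch below is one way to check this; it assumes the posted variables are in memory and is not a definitive diagnosis.

    ```stata
    * Sketch only: assumes the posted variables are in memory

    * Does each semester map to a single value of months_election?
    tabulate semester months_election

    * Regress the linear and squared terms on the semester dummies;
    * an R-squared of 1 would mean the term is an exact linear
    * combination of the dummies
    regress months_election i.semester
    display "R-squared, linear term:  " e(r2)
    regress months_election_sq i.semester
    display "R-squared, squared term: " e(r2)
    ```

    If both regressions fit perfectly, neither term is identified separately from i.semester, and which variable gets dropped depends only on how the terms are presented to Stata.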



    • #3
      Hello Carlo,

      Thank you for your quick response and for pointing out that the issue is collinearity of the quadratic and non-quadratic terms.

      Conceptually, does the collinearity of the quadratic term of -months_election- (I assume with the linear term of -months_election-) suggest anything about the appropriateness of a quadratic model in this case?

      Best,

      Theo



      • #4
        Theodore:
        referring to your first code, there's no evidence of a squared term for -months_election-.
        I would say that the linear term is enough; that said, I would plug it in as a categorical variable (i.e., -i.months_election-).
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Hi again, Carlo,

          Thank you for those suggestions; I will implement them and check the results.

          If it is straightforward to explain, what in particular leads you to suggest including -months_election- as a factor variable?

          For context, the variable measures the number of months prior to elections; because my data is reported on a semester basis, -months_election- only takes the values 0, 6, 12, 18, etc., where 0 indicates the semester in which elections take place.

          Thank you again for your responses.

          Best,
          Theo



          • #6
            Theodore:
            thanks for clarifying.
            I assumed that -months_election- was actually expressed in months, not semesters.
            That said, you would probably be better off with -c.months_election-.
            Kind regards,
            Carlo
            (Stata 19.0)
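
            Since the original question mentions using margins after estimation, here is a minimal sketch with -c.months_election- entered through factor-variable notation. It assumes the posted variables are in memory and is meant to be run from a do-file. The choice of predict(xb) is an assumption made here: the fixed effects are not estimated by the conditional logit, so the sketch works on the linear-predictor scale rather than on probabilities.

            ```stata
            * Sketch only: assumes the posted variables are in memory
            xtset id_ocad semester

            xtlogit prob_project n_projects_cumlag ln_densidad_pob ln_poblacion ///
                l2.ln_indice_desempeno l2.ln_tasa_mort l2.ln_balance ///
                ln_regalias_efec_cap c.months_election i.semester, fe vce(oim)

            * Average marginal effect of months_election on the linear predictor
            margins, dydx(months_election) predict(xb)

            * Linear prediction at the values months_election takes in the data
            margins, at(months_election=(0(6)42)) predict(xb)
            ```

            The grid 0(6)42 follows the values reported in the thread (0, 6, 12, ... up to the upper bound of 42 shown in the utest output).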
