  • The "omitted" results in interaction between two dummy variables

    Hi there,

    I'm running a logistic regression that includes two interaction terms between dummy variables. The code looks like this (only some of the variables are shown):
    Code:
    logit daily_f feduy ocpfm_dum2 ocpfm_dum3  livewothr_c pintim ///
    ocpfm_dum2#livewothr_c ocpfm_dum3#livewothr_c  ///
    ,or
    livewothr_c indicates whether adult relatives other than parents live in the household (yes or no).
    ocpfm_dum2 indicates whether the individual is in a specific occupational status category (yes or no).
    ocpfm_dum3 is similar to ocpfm_dum2.
    All three are dummy variables.

    Then I ran the command. The results about the interaction terms are:
                    daily_f | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
     ocpfm_dum2#livewothr_c |
                       0 1  |   1.377075   .2156025     2.04   0.041     1.013185    1.871657
                       1 0  |          1  (omitted)
                       1 1  |          1  (omitted)
                            |
     livewothr_c#ocpfm_dum3 |
                       0 1  |   1.967462   .4669052     2.85   0.004     1.235677    3.132622
                       1 0  |          1  (omitted)
                       1 1  |          1  (omitted)
                            |
    I'm confused about why 1 0 and 1 1 are omitted. Should I check the data? Is the command wrong? Or is this normal and I just don't understand it?

    Thank you so much.



  • #2
    Vincent:
    first, your code can be written in a more efficient way:
    Code:
    logit daily_f feduy pintim ocpfm_dum2##livewothr_c ocpfm_dum3##livewothr_c, or
    Interested listers still need to know whether Stata reported the reason for the omission (e.g., collinearity or perfect prediction) above the -logit- output table.
    As a (relevant) aside, please use CODE delimiters (just click the # toggle available from the Advanced editor bar) to share what you typed and what Stata gave you back (as per FAQ). Thanks.
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      If you follow the FAQ (http://www.statalist.org/forums/help) and actually provide a slice of sample data, I may be able to dig into it more. But overall, it's perhaps caused by two problems: fitting manually created dummies (which is not wrong per se, but just prone to confusion) and possible mislabeling.

      A 2x2 interaction should estimate four means: 0-0, 0-1, 1-0, and 1-1. The point 0-0 is picked up by the intercept, so that's given. You fitted ocpfm_dum2, livewothr_c, and ocpfm_dum2#livewothr_c, which are three estimates in total, so the model should have captured all of them. The same goes for dum3; I won't repeat that assessment.

      Left as numeric, the interaction ocpfm_dum2#livewothr_c should pick up the 1-1 combination. But the output shows only 0-1. My guess is that livewothr_c was mislabeled (i.e., 0 was given the label "1" and 1 was given the label "0"). Use -browse- to check whether the data in that column are shown in blue (or use -describe- to see whether it has a value label attached).
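
      For example, a minimal sketch of that label check on toy data (the value label name yesno is made up for illustration; your variable may carry a different label, or none at all):

      ```stata
      clear
      input byte livewothr_c
      0
      1
      1
      end
      * hypothetical value label, attached only to illustrate the check
      label define yesno 0 "no" 1 "yes"
      label values livewothr_c yesno
      * -describe- shows the name of any attached value label
      describe livewothr_c
      * compare the labelled and unlabelled tabulations to see the stored numbers
      tabulate livewothr_c
      tabulate livewothr_c, nolabel
      * list the mapping from stored numbers to label text
      label list yesno
      ```

      If the numbers shown with the nolabel option do not line up with the label text the way you expect, the labels are the problem rather than the model.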

      Back to manually created dummies. You can stop doing that; see -help fvvarlist- to learn more about how to model interactions without creating dummies. Put simply, your model can be rewritten as:

      Code:
      logit daily_f feduy pintim i.ocpfm_dum2##i.livewothr_c i.ocpfm_dum3##i.livewothr_c, or
      or, if ocpfm is a 3-level variable to begin with, use the original 3-level variable (I'm assuming it's called ocpfm):

      Code:
      logit daily_f feduy pintim i.ocpfm##i.livewothr_c, or
      and the output will make a lot more sense.
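
      If only the dummies are at hand, the 3-level variable can usually be rebuilt from them. A sketch on toy data, assuming ocpfm_dum2 and ocpfm_dum3 are mutually exclusive and that both being 0 marks the reference category (the name ocpfm is my assumption, as above):

      ```stata
      clear
      input byte(ocpfm_dum2 ocpfm_dum3)
      0 0
      1 0
      0 1
      . .
      end
      * guard: the two dummies must never both be 1
      assert !(ocpfm_dum2 == 1 & ocpfm_dum3 == 1)
      * 0 = reference category, 1 = dum2 category, 2 = dum3 category;
      * missing dummies give a missing ocpfm
      generate byte ocpfm = ocpfm_dum2 + 2*ocpfm_dum3
      tabulate ocpfm, missing
      ```

      After that, i.ocpfm##i.livewothr_c can be used as in the command above.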
      Last edited by Ken Chui; 27 Sep 2021, 07:52.



      • #4
        Dear Carlo,

        First, thank you for reminding me about the use of delimiters. I'll use them properly next time.

        Then, yes, Stata did report the reason:
        Code:
        1.ocpfm_dum2#0.livewothr_c omitted because of collinearity
        1.ocpfm_dum2#1.livewothr_c omitted because of collinearity
        So the omission is due to collinearity.

        Then I rewrote the code as you suggested (thanks a lot):
        Code:
         logit daily_f feduy pintim ocpfm_dum2##livewothr_c ocpfm_dum3##livewothr_c, or
        It works. The results are:
        Code:
                        daily_f | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                   1.ocpfm_dum2 |   1.481956   .1396938     4.17   0.000     1.231965    1.782676
                  1.livewothr_c |   .9404549   .1329639    -0.43   0.664     .7128415    1.240746
                                |
         ocpfm_dum2#livewothr_c |
                           1 1  |    .726177   .1136943    -2.04   0.041     .5342858     .986986
                                |
                   1.ocpfm_dum3 |   2.053498    .323048     4.57   0.000     1.508644    2.795129
                                |
         ocpfm_dum3#livewothr_c |
                           1 1  |   .5082689    .120619    -2.85   0.004     .3192214    .8092732
        So why was there collinearity before? Is it because I included the interaction terms (specified in the wrong way) and the original variables in the model at the same time? And how can I tell which variable is collinear with the interaction term when the values are 1 0 and 1 1?

        Thank you so much.



        • #5
          Dear Ken,

          Thanks for your suggestions.

          First, the reason for the "omitted" is collinearity not the mismatch between the label and the value.

          Second, yes, I did create the interaction terms in an inappropriate way.
          I tried the code you suggested:
          Code:
           
           logit daily_f feduy pintim i.ocpfm##i.livewothr_c, or
          and the results are :
          Code:
              1.livewothr_c |   .9404549   .1329639    -0.43   0.664     .7128415    1.240746
                            |
                      ocpfm |
                         1  |   1.481956   .1396938     4.17   0.000     1.231965    1.782676
                         2  |   2.053498    .323048     4.57   0.000     1.508644    2.795129
                            |
          livewothr_c#ocpfm |
                       1 1  |    .726177   .1136943    -2.04   0.041     .5342858    .9869867
                       1 2  |   .5082689    .120619    -2.85   0.004     .3192214    .8092732
                            |
                      _cons |    .287379   .2123928    -1.69   0.092     .0675085    1.223352
          -----------------------------------------------------------------------------------
          The results are the same as those obtained with the dummy variables. And I think recoding the categorical variable into dummy variables is necessary.

          Thank you so much.



          • #6
            Vincent:
            it is difficult to reply without taking a look at an excerpt/example of your data, that you can easily share via -dataex-.
            Kind regards,
            Carlo
            (Stata 18.0 SE)



            • #7
              Dear Carlo,

              Sorry.
              Below is an example of the data. It does not include all variables because the input statement exceeds the linesize limit (does that matter?).
              Will it help?
              Or would you mind sharing what you would do if it were you?

              Thank you very much.

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input float(daily_f feduy) byte(pedufam_dum2 pedufam_dum3 ocpfm_dum2 ocpfm_dum3 livewp_dum2 livewp_dum3 livewp_dum4) float(livewothr_c pintim)
              1  9 1 0 1 0 0 0 0 0 1
              1 12 1 0 1 0 0 0 0 0 1
              0 16 1 0 1 0 0 0 0 1 0
              0 16 0 0 . . 0 0 0 0 1
              1 16 0 0 0 0 0 0 0 0 1
              0 15 0 1 0 1 0 0 0 1 1
              1 12 0 1 0 0 0 0 0 0 1
              1 16 1 0 . . 0 0 0 0 1
              1  8 1 0 0 0 0 0 0 0 1
              0 16 1 0 1 0 0 0 0 0 1
              0 15 0 0 1 0 0 0 0 0 1
              0  8 0 1 . . 0 1 0 0 0
              1  9 1 0 1 0 0 0 0 0 1
              1 12 0 1 . . 0 0 0 0 1
              1  9 0 1 1 0 0 0 0 1 1
              1  8 0 1 1 0 0 0 0 1 1
              1 15 1 0 0 0 0 0 0 0 1
              0 16 1 0 1 0 0 1 0 0 0
              1  9 1 0 1 0 0 0 0 1 1
              1 16 1 0 0 1 0 0 0 0 1
              1  9 0 0 1 0 0 0 0 0 1
              0  9 0 0 . . 0 0 0 1 1
              1 16 0 0 . . 0 0 0 0 1
              1  8 1 0 1 0 0 0 0 0 1
              0 15 1 0 1 0 0 0 1 0 0
              0  9 1 0 1 0 0 1 0 1 0
              1 19 0 0 1 0 0 0 0 1 1
              1  8 0 1 1 0 0 0 0 0 1
              0 12 0 0 1 0 0 0 0 1 1
              1  8 0 1 1 0 0 0 0 1 1
              0 15 0 0 1 0 0 0 0 0 1
              1 12 0 0 1 0 0 0 0 0 1
              1 15 1 0 1 0 0 0 0 0 1
              1 16 0 0 0 1 0 0 0 1 1
              1  8 1 0 0 1 0 0 0 1 1
              1 16 0 0 1 0 0 0 0 0 1
              1  9 0 1 0 1 0 0 0 0 1
              1  8 1 0 1 0 0 0 0 0 1
              1 15 1 0 0 1 0 0 0 0 1
              1 15 0 0 1 0 0 0 0 0 1
              0 16 1 0 1 0 0 1 0 1 0
              1 15 1 0 0 1 0 0 0 0 0
              0 15 0 0 0 1 0 1 0 0 1
              0 16 0 0 0 0 0 0 0 0 1
              0 19 0 0 1 0 0 0 0 0 1
              0 19 0 0 . . 0 1 0 1 0
              1 19 1 0 1 0 0 0 0 0 1
              0 15 0 0 0 1 0 0 0 0 1
              1  8 1 0 1 0 0 0 0 0 1
              0 15 1 0 0 0 0 0 0 0 1
              1 16 1 0 1 0 0 0 0 0 1
              0 15 0 0 0 0 0 0 0 1 1
              1 16 1 0 1 0 0 0 0 0 1
              1  8 0 1 1 0 0 0 0 0 1
              1  8 0 1 0 1 0 0 0 0 1
              0  9 0 1 . . 0 1 0 1 1
              0  8 0 1 . . 0 1 0 1 0
              1 16 1 0 1 0 0 0 0 0 1
              0 16 0 1 1 0 0 1 0 0 1
              0  8 0 1 1 0 0 0 0 1 1
              0 15 0 0 1 0 0 0 0 0 1
              1 16 1 0 0 0 0 0 0 0 1
              0 19 0 0 1 0 0 0 0 0 0
              1 16 1 0 1 0 0 0 0 0 1
              0  6 1 0 1 0 0 0 0 0 1
              1 19 0 0 0 0 0 0 0 0 1
              0 15 0 1 0 0 0 0 0 0 1
              0 16 1 0 1 0 0 0 0 0 1
              1 15 0 0 0 1 0 0 0 1 1
              1 15 0 1 0 1 0 0 0 0 1
              1 16 1 0 1 0 0 0 0 0 1
              0 16 0 0 1 0 0 0 0 1 1
              0  9 1 0 0 0 0 0 0 1 1
              1 16 0 0 0 0 0 0 0 0 1
              1 11 0 1 0 1 0 0 0 0 1
              1 15 1 0 0 0 0 0 0 1 1
              1  9 1 0 1 0 0 0 0 0 1
              1 16 0 0 0 0 0 1 0 0 1
              1 15 0 1 0 1 0 0 0 1 1
              0 16 1 0 1 0 0 0 0 0 1
              1 19 0 0 1 0 0 0 0 0 1
              0 19 1 0 0 0 0 0 0 0 1
              0 15 0 0 0 0 0 0 0 0 1
              1 15 1 0 0 1 0 0 0 0 1
              1 16 0 1 1 0 0 0 0 0 1
              1  9 1 0 1 0 0 0 0 0 1
              1 16 1 0 . . 0 0 0 0 1
              1 15 1 0 1 0 0 0 0 1 1
              0 15 1 0 0 1 0 0 0 1 1
              0 16 0 0 1 0 0 0 0 0 1
              1 19 1 0 . . 0 0 0 0 1
              0 16 1 0 1 0 0 0 0 0 1
              1  9 0 0 1 0 0 0 0 0 1
              1  9 1 0 1 0 0 0 0 0 1
              0  6 1 0 0 0 0 0 0 0 1
              0  9 1 0 1 0 0 0 0 0 1
              1 12 0 0 . . 0 0 0 0 1
              0  9 1 0 1 0 0 0 0 0 1
              1  9 0 0 1 0 0 0 0 0 1
              1  9 1 0 1 0 0 0 0 0 1
              end



              • #8
                Vincent:
                yes, it depends on the way you coded the interactions in your first command. For example:
                Code:
                1.ocpfm_dum2#0.livewothr_c=ocpfm_dum2
                As you can also see comparing the two models:
                Code:
                . logit daily_f feduy pintim ocpfm_dum2##livewothr_c ocpfm_dum3##livewothr_c, or
                
                Iteration 0:   log likelihood = -58.704243 
                Iteration 1:   log likelihood = -52.100632 
                Iteration 2:   log likelihood = -52.058353 
                Iteration 3:   log likelihood = -52.058231 
                Iteration 4:   log likelihood = -52.058231 
                
                Logistic regression                             Number of obs     =         88
                                                                LR chi2(7)        =      13.29
                                                                Prob > chi2       =     0.0653
                Log likelihood = -52.058231                     Pseudo R2         =     0.1132
                
                ----------------------------------------------------------------------------------------
                               daily_f | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -----------------------+----------------------------------------------------------------
                                 feduy |   .9045513   .0614372    -1.48   0.140     .7918073    1.033349
                                pintim |   11.01572   12.93973     2.04   0.041     1.101896    110.1249
                          1.ocpfm_dum2 |   1.261524   .8224156     0.36   0.722     .3515429    4.527023
                         1.livewothr_c |   .3123884   .4247131    -0.86   0.392     .0217485     4.48705
                                       |
                ocpfm_dum2#livewothr_c |
                                  1 1  |   2.383191   3.679699     0.56   0.574     .1155807     49.1397
                                       |
                          1.ocpfm_dum3 |   3.892826   4.073845     1.30   0.194       .50059    30.27247
                                       |
                ocpfm_dum3#livewothr_c |
                                  1 1  |   1.166291   2.154166     0.08   0.934     .0312334     43.5507
                                       |
                                 _cons |   .5286825   .9092787    -0.37   0.711     .0181642    15.38767
                ----------------------------------------------------------------------------------------
                Note: _cons estimates baseline odds.
                
                . logit daily_f feduy pintim livewothr_c ocpfm_dum2 ocpfm_dum3 ocpfm_dum2#livewothr_c ocpfm_dum3#livewothr_c, or
                
                note: 1.ocpfm_dum2#0.livewothr_c omitted because of collinearity
                note: 1.ocpfm_dum2#1.livewothr_c omitted because of collinearity
                note: 1.ocpfm_dum3#0.livewothr_c omitted because of collinearity
                note: 1.ocpfm_dum3#1.livewothr_c omitted because of collinearity
                Iteration 0:   log likelihood = -58.704243 
                Iteration 1:   log likelihood = -52.100632 
                Iteration 2:   log likelihood = -52.058353 
                Iteration 3:   log likelihood = -52.058231 
                Iteration 4:   log likelihood = -52.058231 
                
                Logistic regression                             Number of obs     =         88
                                                                LR chi2(7)        =      13.29
                                                                Prob > chi2       =     0.0653
                Log likelihood = -52.058231                     Pseudo R2         =     0.1132
                
                ----------------------------------------------------------------------------------------
                               daily_f | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                -----------------------+----------------------------------------------------------------
                                 feduy |   .9045513   .0614372    -1.48   0.140     .7918073    1.033349
                                pintim |   11.01572   12.93973     2.04   0.041     1.101896    110.1249
                           livewothr_c |    .868282   1.713673    -0.07   0.943     .0181425    41.55504
                            ocpfm_dum2 |   1.261524   .8224156     0.36   0.722     .3515429    4.527023
                            ocpfm_dum3 |   3.892826   4.073845     1.30   0.194       .50059    30.27247
                                       |
                ocpfm_dum2#livewothr_c |
                                  0 1  |   .4196055   .6478799    -0.56   0.574     .0203501    8.651965
                                  1 0  |          1  (omitted)
                                  1 1  |          1  (omitted)
                                       |
                ocpfm_dum3#livewothr_c |
                                  0 1  |   .8574187   1.583671    -0.08   0.934     .0229617    32.01703
                                  1 0  |          1  (omitted)
                                  1 1  |          1  (omitted)
                                       |
                                 _cons |   .5286825   .9092787    -0.37   0.711     .0181642    15.38767
                ----------------------------------------------------------------------------------------
                Note: _cons estimates baseline odds.
                
                .
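
                The two tables describe the same fit under different parameterisations. This can be checked with a few lines of arithmetic on the odds ratios reported above (a sketch; the numbers are copied from the two tables, the scalar names are made up, and small differences are rounding):

                ```stata
                * the 0 1 cells in the second table are the reciprocals of the
                * 1 1 cells in the first table
                scalar r_dum2 = 1/2.383191
                scalar r_dum3 = 1/1.166291
                * livewothr_c in the second table absorbs both interaction odds ratios
                scalar p_live = .3123884*2.383191*1.166291
                * compare with .4196055, .8574187 and .868282 in the second table
                display r_dum2
                display r_dum3
                display p_live
                ```

                The same reciprocal relation holds in Vincent's full-sample results: 1/.726177 and 1/.5082689 are close to the 1.377075 and 1.967462 reported for the 0 1 cells in the first post.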
                Kind regards,
                Carlo
                (Stata 18.0 SE)



                • #9
                  Thank you Carlo,

                  But one more question: what's the difference between these two commands?
                  Code:
                   
                   logit daily_f feduy ocpfm_dum2 ocpfm_dum3  livewothr_c pintim ocpfm_dum2#livewothr_c ocpfm_dum3#livewothr_c,or
                  Code:
                   
                   logit daily_f feduy pintim ocpfm_dum2##livewothr_c ocpfm_dum3##livewothr_c, or
                  In the first one, I include the original variables and the interaction terms (using #) in the model. # adds only the interaction term to the model, right?
                  In the second one, ## adds the original variables and the interaction term to the model at the same time.

                  They look the same but lead to different results.



                  • #10
                    Vincent:
                    the difference in some of the coefficients reported by the two commands is due to the different terms included in the interactions.
                    Take a look at the following elaboration on the previous commands:
                    Code:
                    logit daily_f feduy pintim ocpfm_dum2##livewothr_c ocpfm_dum3##livewothr_c, or
                    predict pr1, p
                    logit daily_f feduy pintim livewothr_c ocpfm_dum2 ocpfm_dum3 ocpfm_dum2#livewothr_c ocpfm_dum3#livewothr_c, or
                    predict pr2, p
                    list pr1 pr2 in 1/10
                    . list pr1 pr2 in 1/10
                    
                         +---------------------+
                         |      pr1        pr2 |
                         |---------------------|
                      1. | .7486511   .7486511 |
                      2. | .6879349   .6879349 |
                      3. | .0906954   .0906954 |
                      4. |        .          . |
                      5. | .5391439   .5391439 |
                         |---------------------|
                      6. | .6471807   .6471807 |
                      7. | .6360271   .6360271 |
                      8. |        .          . |
                      9. |  .723007    .723007 |
                     10. | .5960943   .5960943 |
                         +---------------------+
                    
                    .
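
                    To see what ## expands to, here is a small sketch on the auto dataset shipped with Stata rather than on Vincent's data (the dummy hirep and the use of -regress- are assumptions made only to keep the example short and self-contained):

                    ```stata
                    sysuse auto, clear
                    * hypothetical dummy: 1 if the repair record is 4 or 5, missing if rep78 is missing
                    generate byte hirep = (rep78 >= 4) if !missing(rep78)

                    * ## is shorthand for both main effects plus the interaction
                    regress price foreign##hirep
                    scalar b_short = _b[1.foreign#1.hirep]
                    scalar rss_short = e(rss)

                    * the same model spelled out; note the i. prefix on the main effects
                    regress price i.foreign i.hirep foreign#hirep
                    scalar b_long = _b[1.foreign#1.hirep]

                    * without i., the main effects enter as continuous variables, so # expands
                    * every cell and Stata drops the redundant ones; the fit is unchanged
                    regress price foreign hirep foreign#hirep
                    scalar rss_bare = e(rss)

                    display b_short " " b_long
                    display rss_short " " rss_bare
                    ```

                    The first two commands report the same interaction coefficient, and the third has the same residual sum of squares, which is the counterpart of the identical predictions listed above.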
                    Last edited by Carlo Lazzaro; 29 Sep 2021, 05:11.
                    Kind regards,
                    Carlo
                    (Stata 18.0 SE)



                    • #11
                      Thanks Carlo,

                      Sorry, I still have a question and thanks for your patience.

                      The collinearity must be between some a and b, but which is a and which is b?
                      Is ocpfm_dum2#livewothr_c collinear with ocpfm_dum2 when the values are 1 0 and 1 1?

                      Code:
                      . logit daily_f feduy pintim livewothr_c ocpfm_dum2 ocpfm_dum3 ocpfm_dum2#livewothr_c ocpfm_dum3#livewothr_c, or

                      note: 1.ocpfm_dum2#0.livewothr_c omitted because of collinearity
                      note: 1.ocpfm_dum2#1.livewothr_c omitted because of collinearity
                      note: 1.ocpfm_dum3#0.livewothr_c omitted because of collinearity
                      note: 1.ocpfm_dum3#1.livewothr_c omitted because of collinearity
                      Iteration 0:   log likelihood = -58.704243
                      Iteration 1:   log likelihood = -52.100632
                      Iteration 2:   log likelihood = -52.058353
                      Iteration 3:   log likelihood = -52.058231
                      Iteration 4:   log likelihood = -52.058231

                      Logistic regression                             Number of obs     =         88
                                                                      LR chi2(7)        =      13.29
                                                                      Prob > chi2       =     0.0653
                      Log likelihood = -52.058231                     Pseudo R2         =     0.1132

                      ----------------------------------------------------------------------------------------
                                     daily_f | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -----------------------+----------------------------------------------------------------
                                       feduy |   .9045513   .0614372    -1.48   0.140     .7918073    1.033349
                                      pintim |   11.01572   12.93973     2.04   0.041     1.101896    110.1249
                                 livewothr_c |    .868282   1.713673    -0.07   0.943     .0181425    41.55504
                                  ocpfm_dum2 |   1.261524   .8224156     0.36   0.722     .3515429    4.527023
                                  ocpfm_dum3 |   3.892826   4.073845     1.30   0.194       .50059    30.27247
                                             |
                      ocpfm_dum2#livewothr_c |
                                        0 1  |   .4196055   .6478799    -0.56   0.574     .0203501    8.651965
                                        1 0  |          1  (omitted)
                                        1 1  |          1  (omitted)
                                             |
                      ocpfm_dum3#livewothr_c |
                                        0 1  |   .8574187   1.583671    -0.08   0.934     .0229617    32.01703
                                        1 0  |          1  (omitted)
                                        1 1  |          1  (omitted)
                                             |
                                       _cons |   .5286825   .9092787    -0.37   0.711     .0181642    15.38767
                      ----------------------------------------------------------------------------------------
                      Note: _cons estimates baseline odds.
                      Last edited by Vincent Li; 29 Sep 2021, 06:27.



                      • #12
                        Vincent:
                        Code:
                        note: 1.ocpfm_dum2#0.livewothr_c omitted because of collinearity *omitted because=ocpfm_dum2*
                        note: 1.ocpfm_dum2#1.livewothr_c omitted because of collinearity *omitted because=ocpfm_dum3*
                        note: 1.ocpfm_dum3#0.livewothr_c omitted because of collinearity *omitted to avoid the so called dummy trap (https://en.wikipedia.org/wiki/Dummy_variable_(statistics))*
                        note: 1.ocpfm_dum3#1.livewothr_c omitted because of collinearity *omitted to avoid the so called dummy trap (https://en.wikipedia.org/wiki/Dummy_variable_(statistics))*
                        See also -margins- and -marginsplot- entries in Stata .pdf manual.
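
                        One way to check which terms are involved, sketched on toy data (the indicator names c01, c10 and c11 are made up here; they stand for the 0 1, 1 0 and 1 1 cells of ocpfm_dum2#livewothr_c):

                        ```stata
                        clear
                        input byte(ocpfm_dum2 livewothr_c)
                        0 0
                        0 1
                        1 0
                        1 1
                        1 0
                        0 1
                        end
                        * indicator columns for the cells that ocpfm_dum2#livewothr_c expands to
                        generate byte c01 = (ocpfm_dum2 == 0 & livewothr_c == 1)
                        generate byte c10 = (ocpfm_dum2 == 1 & livewothr_c == 0)
                        generate byte c11 = (ocpfm_dum2 == 1 & livewothr_c == 1)
                        * each omitted cell is an exact linear combination of terms already in the model
                        assert c11 == livewothr_c - c01
                        assert c10 == ocpfm_dum2 - livewothr_c + c01
                        ```

                        Both assertions hold, so once livewothr_c, ocpfm_dum2 and the 0 1 cell are in the model, the 1 0 and 1 1 cells carry no additional information.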
                        Kind regards,
                        Carlo
                        (Stata 18.0 SE)



                        • #13
                          Carlo,

                          Thank you so much.
