  • interaction effects LPM vs logistic

    Hi!

    I am trying to understand why my interaction term gets different signs in a logistic model and a linear probability model (LPM). It is negative in the LPM, which I think is correct (judging from plots of the data, the marginal effects, and the interaction at each value). But the logistic model demonstrates a positive non-significant interaction.
    I am interacting working-class parents with years of education.
    The LPM gives a negative interaction of -.015, and the logistic model shows 1.04.

    When I plot the marginal effects per year of education, they show a negative influence at almost all values, especially low ones. Does this have something to do with high/low values and the interaction?

    I came across this:
    https://www.tau.ac.il/~yoavgn/files/...s_logistic.pdf
    which says: "... in the probability of y is higher when x is low than when x is high. This corresponds to a negative interaction in a linear probability model. When the data are represented in terms of the log odds, ... Because the log of the odds is the dependent variable in the logistic model, this corresponds to a positive interaction in a logistic regression."
    After reading this I still don't fully understand what the log odds have to do with it.
    Does it have to do with a weaker interaction effect at higher levels? Can someone help explain it in simpler words? (Picture: interaction per year of education, using the logistic model.)
    [Image: inter2.jpg]


  • #2
    Karin, the graphic you inserted is very hard to read. It is very blurry. I can't even tell where this OR of 1.04 is coming from. I don't know what the variables and their categories are or how the graphic corresponds to your question.

    Look at the Statalist FAQ, esp pt 12, on asking questions effectively. Pay particular attention to the use of code tags.

    Then repost, showing both the commands and output for the lpm and logistic analyses you ran.

    You say "the logistic model demonstrate a positive non-significant interaction." Since non-significant, that means negative values fall within the confidence interval. Therefore there isn't necessarily any inconsistency between the lpm and logistic. Also you don't indicate whether the lpm negative interaction is statistically significant or not, which further makes it difficult to tell if there even is an inconsistency.
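
    Apart from significance, the two models measure the interaction on different scales, so opposite signs need not contradict each other. A small illustration with invented probabilities (not taken from your data), where group 1 vs group 0 is a 0/1 indicator and "low"/"high" are two levels of the other variable:

    ```latex
    \begin{aligned}
    &\text{Probabilities:} && p_{0,\text{low}} = 0.10,\quad p_{1,\text{low}} = 0.05,\quad
                              p_{0,\text{high}} = 0.60,\quad p_{1,\text{high}} = 0.45 \\
    &\text{Probability scale:} && (0.45 - 0.60) - (0.05 - 0.10) = -0.15 + 0.05 = -0.10 \\
    &\text{Log-odds scale:} && \left[\ln\tfrac{0.45}{0.55} - \ln\tfrac{0.60}{0.40}\right]
                               - \left[\ln\tfrac{0.05}{0.95} - \ln\tfrac{0.10}{0.90}\right]
                               \approx -0.606 - (-0.747) = 0.141
    \end{aligned}
    ```

    In this made-up case the group gap widens in probability terms (a negative interaction in an LPM) while it narrows in log-odds terms (a positive interaction coefficient, i.e. an odds ratio of about exp(0.141) = 1.15).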

    One of the arguments against the LPM is that significance tests may not be right.

    Personally, I agree with Paul Allison, who basically says you can get the best of both worlds by using logistic regression and the margins command.

    https://statisticalhorizons.com/in-d...f-logit-part-2
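
    A minimal sketch of what that approach could look like. The variable names (y, wclass, educ, x1, x2) and the education values in at() are placeholders, not taken from your data, and wclass is assumed to be coded 0/1:

    ```stata
    * Hypothetical names: y = binary outcome, wclass = 0/1 working-class origin,
    * educ = years of education, x1 x2 = other controls.
    logit y i.wclass##c.educ x1 x2, vce(robust)

    * Effect of wclass on Pr(y = 1), evaluated at several education levels
    * (the range 9 to 18 in steps of 3 is an assumption; pick values in your data)
    margins, dydx(wclass) at(educ = (9(3)18))

    * Plot those effects with confidence intervals
    marginsplot
    ```

    The margins output is on the probability scale, so it can be compared directly with what the LPM is trying to estimate.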
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    StataNow Version: 19.5 MP (2 processor)

    WWW: https://www3.nd.edu/~rwilliam


    • #3
      Richard,

      Thank you for your response. The 1.04 comes from the logistic model with education entered as a continuous variable. The graphic is only meant to show that the negative influence of working-class origin is smaller at higher levels of education, as I was thinking that this might change my results. The LPM interaction is significant and has robust results, which is why the difference between the models struck me as strange, and I was trying to find an explanation for it.

      The marginal effects of "arbetarklassbakgrund=1" look negative to me in the image, but at very high levels of education they are positive (see next post).
      [Image: Graph.png]

      Last edited by karin kristensson; 15 Sep 2019, 02:14.


      • #4
        Here is the plot when high levels of education are included:
        [Image: margins2.png]


        • #5
          You have not done what Richard asked you to do. Without that it is difficult to help you:

          "Look at the Statalist FAQ, esp pt 12, on asking questions effectively. Pay particular attention to the use of code tags.
          Then repost, showing both the commands and output for the lpm and logistic analyses you ran."


          • #6
            I am sorry, I will try to get it right.

            The code I used was the same for both models, except for changing regress to logistic.

            THE LPM:
            Code:
            gen utbarXarb=utbildningsår*arbetarklassbakgrund
            regress högretjänsteman utbarXarb arbetarklassbakgrund utbildningsår ålder kvinna andragen arbetslivserfarenhet arbetslivserfarenhet2 tvatusen, robust


            Output:
            . regress högretjänsteman utbarXarb arbetarklassbakgrund utbildningsår ålder kvinna andragen arbets
            > livserfarenhet arbetslivserfarenhet2 tvatusen, robust

            Linear regression                       Number of obs   =      4,735
                                                    F(9, 4725)      =      86.45
                                                    Prob > F        =     0.0000
                                                    R-squared       =     0.1802
                                                    Root MSE        =     .33236

                                  |                Robust
            högretjänsteman       |      Coef.  Std. Err.        t  P>|t|    [95% Conf. Interval]
            ----------------------+--------------------------------------------------------------
            utbarXarb             |  -.0141404   .0035221    -4.01  0.000   -.0210454   -.0072354
            arbetarklassbakgrund  |   .1221207   .0391108     3.12  0.002    .0454454     .198796
            utbildningsår         |   .0508958   .0022951    22.18  0.000    .0463963    .0553952
            ålder                 |  -.0000277   .0003029    -0.09  0.927   -.0006216    .0005662
            kvinna                |   -.056733   .0097107    -5.84  0.000   -.0757706   -.0376954
            andragen              |  -.0552507   .0214457    -2.58  0.010   -.0972942   -.0132072
            arbetslivserfarenhet  |   .0051176   .0007365     6.95  0.000    .0036736    .0065616
            arbetslivserfarenhet2 |  -.0000406   .0000101    -4.02  0.000   -.0000603   -.0000208
            tvatusen              |   .0283215   .0106192     2.67  0.008     .007503      .04914
            _cons                 |  -.5377979   .0374675   -14.35  0.000   -.6112517    -.464344
            .

            THE LOGISTIC:
            Code:
            logistic högretjänsteman utbarXarb arbetarklassbakgrund utbildningsår ålder kvinna andragen arbetslivserfarenhet arbetslivserfarenhet2 tvatusen, robust



            Output:
            . logistic högretjänsteman utbarXarb arbetarklassbakgrund utbildningsår ålder kvinna andragen arbet
            > slivserfarenhet arbetslivserfarenhet2 tvatusen, robust

            Logistic regression                     Number of obs   =      4,735
                                                    Wald chi2(9)    =     576.86
                                                    Prob > chi2     =     0.0000
            Log pseudolikelihood = -1652.2071       Pseudo R2       =     0.2066

                                  |                Robust
            högretjänsteman       | Odds Ratio  Std. Err.        z  P>|z|    [95% Conf. Interval]
            ----------------------+--------------------------------------------------------------
            utbarXarb             |   1.045756   .0384651     1.22  0.224    .9730193     1.12393
            arbetarklassbakgrund  |   .3136205   .1643819    -2.21  0.027    .1122673    .8761037
            utbildningsår         |   1.434982   .0289727    17.89  0.000    1.379305    1.492906
            ålder                 |    .997769   .0036922    -0.60  0.546    .9905586    1.005032
            kvinna                |   .5874759   .0531224    -5.88  0.000    .4920625    .7013904
            andragen              |   .5152386   .1513172    -2.26  0.024    .2897507    .9162044
            arbetslivserfarenhet  |    1.04599   .0097359     4.83  0.000    1.027081    1.065247
            arbetslivserfarenhet2 |   .9995538   .0001637    -2.73  0.006    .9992331    .9998746
            tvatusen              |   1.126824   .1261289     1.07  0.286    .9048535    1.403245
            _cons                 |   .0010692   .0003931   -18.61  0.000    .0005202    .0021978
            Note: _cons estimates baseline odds.

            Was that easier to interpret? My question is about the interaction utbarXarb.


            • #7
              Hi Karin. This is a little better but still sub-optimal. The way you did it, when there are two or more consecutive spaces, every space after the first gets deleted and the output doesn't line up correctly. As described in pt 12 of the FAQ, you should use code tags. Some people will put in the effort to decipher hard-to-read output, but I tend not to be one of them! Whatever you can do to convey your question clearly is to your benefit.

              It appears that you computed the interaction yourself. This is a bad idea. You should use factor variable notation. See

              help fvvarlist

              More critically, I can't tell if you included the main effects along with the interaction term. If not, you should. You may have, but whatever language you are using for your variables is not one I speak. If you used factor variable notation, it would be obvious how you handled the main effects and interactions.
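
              As a rough sketch of what that might look like, assuming utbildningsår and arbetarklassbakgrund are the two variables multiplied in your gen line, that arbetarklassbakgrund is coded 0/1, and that the remaining variables are controls (the /// line continuation only works in a do-file):

              ```stata
              * LPM: the ## operator enters both main effects and their interaction,
              * so no hand-made product variable is needed
              regress högretjänsteman i.arbetarklassbakgrund##c.utbildningsår ///
                  ålder kvinna andragen arbetslivserfarenhet arbetslivserfarenhet2 ///
                  tvatusen, vce(robust)

              * Same specification as a logistic model (reports odds ratios)
              logistic högretjänsteman i.arbetarklassbakgrund##c.utbildningsår ///
                  ålder kvinna andragen arbetslivserfarenhet arbetslivserfarenhet2 ///
                  tvatusen, vce(robust)
              ```

              With this notation Stata knows which terms belong to the interaction, which is what allows margins to compute effects correctly afterwards.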
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
