  • Interaction term insignificance & Marginsplot problem

    Hi all,

    My initial pooled OLS estimation yields results with statistically significant coefficients.

    I ran a correlation matrix, which shows a correlation of 0.4 between two of my explanatory variables, education and income. This indicates a moderate, positive relationship and so I decided to include an interaction term (education*income) in my regression as follows:

    Code:
    reg cashshare age c.incometh##i.educat male credit cheque rating holdings i.year if sample==1, robust
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(newID year cashshare) double age float(incometh educat male) double(credit cheque) float rating double holdings float sample
     1 2015   .1334569 31 112.5 4 1 1 1 20 108.33333333333341 1
     1 2016   .3030303 32 112.5 4 1 1 1 21                 20 1
     1 2017   .1935484 34 112.5 4 1 1 1 24  969.6428580000008 1
     2 2015  .12854996 66  27.5 4 0 1 1 21  300.0000000000001 1
     2 2016   .1992903 67  17.5 4 0 1 1 22 180.00000000000006 1
     2 2017   .1682243 68  17.5 4 0 1 1 23                280 1
     3 2016 .020833334 41 112.5 3 1 1 1 29                  0 1
     3 2017          1 42 112.5 3 1 1 1 25                 80 1
     4 2015  .05921588 25  32.5 2 1 0 1 25               82.5 1
     4 2016          0 26  37.5 2 1 0 1 26 11.666666666666664 1
     4 2017          0 27  37.5 2 1 1 1 24  973.9285715999991 1
     5 2015   .3259842 53  37.5 3 0 1 1 21 130.44642870000004 1
     5 2016  .20155144 55  32.5 3 0 1 1 23  80.00000000000001 1
     5 2017   .4016003 56  22.5 3 0 1 1 22  60.00000000000001 1
     6 2015  .06490872 26    55 4 0 1 1 20                 80 1
     6 2016     .09375 28    55 4 0 1 1 20                 20 1
     7 2015  .05271691 83 112.5 3 1 1 1 22 1304.4642869999998 1
     7 2016  .05449017 84 112.5 3 1 1 1 18                600 1
     7 2017  .24793923 85 112.5 3 1 1 0 20                300 1
     8 2015          0 38 112.5 3 1 1 1 14  83.33333333333327 1
     8 2016  .13636364 40 162.5 3 1 1 1 17  85.73735572900041 1
     8 2017          0 41 162.5 3 1 1 1 15 199.99999999999994 1
     9 2015  .53912795 57  22.5 2 0 1 1 29                 80 1
     9 2016  .12972517 58  22.5 2 0 1 1 28 200.00000000000014 1
     9 2017  .54251766 59  22.5 2 0 1 1 29 13.333333333333336 1
    10 2015 .023809524 57 162.5 4 1 1 1 22  869.6428579999998 1
    10 2016  .29116118 58    55 4 1 1 1 23  710.3417382269064 1
    10 2017  .13953489 59    45 4 1 1 1 21  869.6428579999998 1
    11 2015  .52614975 44  87.5 3 1 1 1 23 434.82142900000036 1
    11 2016   .7837778 46  87.5 3 1 1 1 23 434.82142900000036 1
    11 2017   .8249276 47  87.5 3 1 1 1 23 500.00000000000045 1
    12 2015  .12204076 54 162.5 3 1 1 1 21  150.0000000000001 1
    12 2016          0 56 162.5 3 1 1 1 22                 40 1
    12 2017   .0961064 57 162.5 3 1 1 1 24  60.00000000000001 1
    13 2015  .06973366 64  17.5 3 0 1 1 16 100.00000000000007 1
    13 2016   .1854961 66  17.5 3 0 1 1 22   708.333333333333 1
    13 2017   .0924408 67 11.25 3 0 1 1 19                300 1
    14 2015  .50914204 48  6.25 3 0 0 0 24        2174.107145 1
    14 2016   .3966907 49  8.75 3 0 0 0 30 1779.2857159999999 1
    14 2017   .8424754 50  8.75 3 0 0 1 24  521.7857148000004 1
    15 2015  .24536224 54 112.5 3 0 1 1 24  257.4107145000001 1
    15 2017          . 57 112.5 3 0 1 1 16                  . 0
    16 2015  .05657994 56 162.5 3 1 1 1 19 200.00000000000014 1
    16 2016   .2283169 58 162.5 3 1 1 1 22 100.00000000000006 1
    16 2017  .14577565 58 162.5 3 1 1 1 20 100.00000000000007 1
    17 2015   .2158688 53  67.5 2 1 1 1 25 373.92857160000017 1
    17 2016  .53102005 55  67.5 2 1 0 1 19 240.00000000000014 1
    18 2015  .03986711 47 162.5 3 0 1 1 19  33.33333333333334 1
    18 2017   .0815647 50 162.5 3 0 1 1 14 100.00000000000007 1
    19 2015  .06666667 49  67.5 4 1 1 1 21 23.333333333333336 1
    19 2016  .03590127 51  67.5 4 1 1 1 19                 40 1
    19 2017  .07803112 52  67.5 4 1 1 1 26                 80 1
    20 2015   .1700716 62  87.5 2 1 1 1 24  280.8928574000001 1
    20 2016  .17845364 64 112.5 2 1 1 1 24 180.00000000000006 1
    20 2017  .28082514 64 112.5 2 1 1 1 23 146.96428580000003 1
    21 2015  .20849185 64  8.75 3 0 1 1 16 173.92857160000003 1
    21 2016  .46384865 65  8.75 3 0 1 1 23  95.29761913333337 1
    21 2017   .3653846 66  8.75 3 0 1 1 17                120 1
    22 2016   .6631991 50  32.5 2 1 1 1 25  782.6785722000002 1
    22 2017   .6666667 51  32.5 2 1 1 1 22  360.0000000000001 1
    23 2015  .04206984 46 162.5 4 1 1 1 19                 20 1
    23 2017          0 49 162.5 4 1 1 1 20                 80 1
    24 2015   .1640541 44  87.5 3 1 1 1 20               1600 1
    24 2016   .3001541 45 112.5 3 1 1 1 20 120.00000000000001 1
    24 2017        .25 46 112.5 3 1 1 1 16 100.00000000000007 1
    25 2015        .12 28   2.5 4 0 1 1 16                260 1
    25 2016          0 29  17.5 4 0 1 1 21                 80 1
    25 2017 .069695085 30  27.5 4 0 1 1 17 46.666666666666664 1
    26 2015        .18 30    45 4 0 1 1 17 100.00000000000007 1
    26 2016  .06896552 32    45 4 0 1 1 25                 60 1
    26 2017  .14180991 32    45 4 0 1 1 23 100.00000000000007 1
    27 2015  .23148148 52  67.5 4 1 1 1 20 200.00000000000014 1
    27 2016   .2897196 52  67.5 4 1 1 1 21  360.0000000000003 1
    27 2017  .20763187 53  67.5 4 1 1 1 19  340.0000000000003 1
    28 2015  .09425198 46 162.5 3 1 1 1 17  554.8214290000002 1
    28 2016  .06666667 47 112.5 3 1 1 1 22 180.00000000000006 1
    28 2017   .4494983 48 112.5 3 1 1 1 24 120.00000000000001 1
    29 2015          0 31  67.5 3 1 1 1 20                 20 1
    29 2016  .12244898 33  67.5 3 1 1 1 22                160 1
    29 2017       .125 34  67.5 3 1 1 1 20                 40 1
    30 2015   .3865514 56  67.5 2 1 0 1 29 3892.8854616688204 1
    30 2016  .25685653 58  67.5 2 1 0 1 19 126.96428580000003 1
    30 2017         .2 59  87.5 2 1 0 1 19 213.92857160000003 1
    31 2015   .3388633 58  32.5 2 1 0 1 25 195.66964305000005 1
    31 2016   .4044944 60  32.5 2 1 0 1 22 200.00000000000014 1
    31 2017   .4013378 61  32.5 2 1 0 1 22 200.00000000000003 1
    32 2015   .1178344 30 112.5 3 1 1 1 21  33.33333333333334 1
    32 2016 .012800976 32 112.5 3 1 1 1 21 25.000000000000018 1
    32 2017  .01222494 33 112.5 3 1 1 1 22   41.6666666666667 1
    33 2015    .522196 59  67.5 2 1 0 1 15 1739.2857159999999 1
    33 2016  .52024233 61  67.5 2 1 0 1 23  521.7857148000002 1
    34 2015          . 69  8.75 1 1 . .  .                  . 0
    34 2016          1 69  8.75 1 1 0 0 12 130.44642870000007 1
    34 2017          1 71  8.75 1 1 0 1 17 173.92857160000003 1
    35 2015          0 53  87.5 3 1 1 1 25  53.33333333333333 1
    35 2016   .1923077 54  67.5 3 1 0 1 21                 20 1
    35 2017  .06060606 55  87.5 3 1 0 1 26  33.33333333333333 1
    36 2015   .4244186 37 13.75 3 0 0 1 23 180.00000000000003 1
    36 2016  .17261343 39  8.75 3 0 0 1 23 213.92857160000003 1
    36 2017  .15873533 40 11.25 3 0 0 1 21 173.92857160000003 1
    end
    label values newID newID
    label def newID 1 "140100007", modify
    label def newID 2 "140100010", modify
    label def newID 3 "140100035", modify
    label def newID 4 "140100038", modify
    label def newID 5 "140100047", modify
    label def newID 6 "140100048", modify
    label def newID 7 "140100055", modify
    label def newID 8 "140100072", modify
    label def newID 9 "140100081", modify
    label def newID 10 "140100108", modify
    label def newID 11 "140100116", modify
    label def newID 12 "140100125", modify
    label def newID 13 "140100143", modify
    label def newID 14 "140100144", modify
    label def newID 15 "140100160", modify
    label def newID 16 "140100168", modify
    label def newID 17 "140100175", modify
    label def newID 18 "140100179", modify
    label def newID 19 "140100183", modify
    label def newID 20 "140100236", modify
    label def newID 21 "140100244", modify
    label def newID 22 "140100288", modify
    label def newID 23 "140100295", modify
    label def newID 24 "140100299", modify
    label def newID 25 "140100300", modify
    label def newID 26 "140100307", modify
    label def newID 27 "140100310", modify
    label def newID 28 "140100317", modify
    label def newID 29 "140100324", modify
    label def newID 30 "140100329", modify
    label def newID 31 "140100333", modify
    label def newID 32 "140100335", modify
    label def newID 33 "140100341", modify
    label def newID 34 "140100346", modify
    label def newID 35 "140100378", modify
    label def newID 36 "140100414", modify
    label values educat educat_label
    label def educat_label 1 "no diploma", modify
    label def educat_label 2 "high school", modify
    label def educat_label 3 "graduate", modify
    label def educat_label 4 "post graduate", modify
    label values male male_label
    label def male_label 0 "female", modify
    label def male_label 1 "male", modify
    label values credit credit_label
    label def credit_label 0 "no credit card", modify
    label def credit_label 1 "credit card owner", modify
    lab var cashshare "cash share of total transactions in a typical month"
    lab var age "age"
    lab var incometh "income measured in thousands of dollars"
    lab var educat "highest level of education"
    lab var male "is a male"
    lab var credit "owns credit card"
    lab var cheque "owns checking account"
    lab var rating "total rating of cash out of 30"
    lab var holdings "total cash held in a typical month"
    When I ran the OLS estimation with the interaction term included, the coefficient on income became insignificant and the coefficient on the interaction term was insignificant too.

    (1) I would like to understand the reasoning for this and any ideas would be really helpful. Also, I am thinking of removing the interaction term from my model because beforehand all coefficients were significant but I am still unsure about this.

    (2) I would also like to understand more about the relationship between education and income through margins and marginsplot but I am unsure of the correct code. I began with the following code which gave me error message “only factor variables and their interactions are allowed r(198);”

    Code:
    margins incometh educat
    Many thanks in advance.
    Last edited by sladmin; 11 May 2020, 07:59. Reason: anonymize original poster

  • #2
    Your reasoning was bad from the start. The fact that education and income are correlated is not a reason to include an interaction term. The reason to add an interaction term in a model is to give the model the flexibility to allow the effect of either of the two variables to depend on the value of the other. You add an interaction term when there is reason to believe that either variable's effect on the outcome depends on the value of the other. This has nothing at all to do with whether the variables themselves are correlated.

    Next, you should never select your model based on whether it gives "significant" results that you would like to see. That is not science. It's an egregious form of statistical malpractice that goes by several names, among them, p-hacking, data dredging, and noise mining. Moreover it is viewed by some as scientific misconduct if done knowing that it is wrong. You should remove or retain the interaction term based on whether the interaction itself is a meaningful contributor to the results. One way to do that would be to look at how much R2 differs between the two models.
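
    As a rough sketch of that comparison, using the variable names from post #1: fit the model with and without the interaction and display each R-squared. The -testparm- line is an extra step beyond the R-squared comparison, added here as one more piece of evidence about the interaction as a whole; it is not a rule for choosing the model.

    ```stata
    * Model without the interaction
    regress cashshare age c.incometh i.educat male credit cheque rating holdings i.year if sample==1, robust
    display "R-squared without interaction: " %6.4f e(r2)

    * Model with the interaction
    regress cashshare age c.incometh##i.educat male credit cheque rating holdings i.year if sample==1, robust
    display "R-squared with interaction:    " %6.4f e(r2)

    * Joint Wald test that all income-by-education interaction coefficients are zero
    testparm i.educat#c.incometh
    ```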

    Next, the fact that the significance findings changed when you added the interaction term means absolutely nothing at all. In a model with an interaction term education#income, the coefficient of education no longer is the effect of education. It is the effect of education when income is zero. Similarly the coefficient of income in that model is the effect of income when education is zero (or, for a categorical variable such as i.educat, when education is at its base category). So these are not comparable to the coefficients of income and education in a non-interaction model, where they do represent the overall effects of income and education, respectively. So comparing them across the two different models is not an apples-to-apples comparison and there is no reason to expect any aspect of their results to be the same, or even vaguely similar. They can change in drastically different ways, particularly if there really is an interaction effect!
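
    One way to see this with the model from post #1 (a sketch, not the only approach) is to ask -margins- for the slope of cashshare with respect to incometh at each level of educat. With c.incometh##i.educat, the coefficient on incometh alone is only the slope in the base education category; the slope in every other category adds the matching interaction coefficient.

    ```stata
    * Refit the interaction model so that -margins- has results to work from
    quietly regress cashshare age c.incometh##i.educat male credit cheque rating holdings i.year if sample==1, robust

    * Slope of cashshare with respect to incometh, separately for each education level
    margins educat, dydx(incometh)

    * Optional: plot the education-specific slopes with their confidence intervals
    marginsplot
    ```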

    The -margins- command you show will give you an error message. Because incometh is a continuous variable, it cannot be used in that way. What you need to do is pick some values of incometh that are representative of the range of interesting values. To illustrate the code, and since incometh is measured in thousands of dollars, I'll assume that 10, 50, and 100 are a set of interesting values. Then you can do:
    Code:
    margins educat, at(incometh = (10 50 100))
    marginsplot
    Then you will get a graph with 4 lines, one for each value of education, each showing the expected value of cashshare as a function of income for that educational group.
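
    If smoother lines are wanted, a variation along the following lines may help. It assumes incometh is in thousands of dollars, as its variable label says, and the grid from 10 to 150 in steps of 20 is only an illustrative choice based on the example data. The xdimension() option states explicitly that income goes on the horizontal axis, and noci suppresses the confidence intervals so the four lines are easier to tell apart.

    ```stata
    * Refit the interaction model so that -margins- has results to work from
    quietly regress cashshare age c.incometh##i.educat male credit cheque rating holdings i.year if sample==1, robust

    * Predicted cashshare for each education level over a grid of income values
    margins educat, at(incometh = (10(20)150))

    * Income on the x axis, one line per education level, no confidence intervals
    marginsplot, xdimension(incometh) noci
    ```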

    • #3
      Thanks Clyde - I’m still getting to grips with Stata and conducting regressions. I appreciate your response and patience.
