
  • Why do we need to use a fixed effect for a binary variable?

    While looking for an explanation of interaction variables, I came across this link.
    In it, I saw this paragraph:

    GPA is the student’s Grade Point Average (higher values indicate better grades). The average gpa is about 2.81. The range of gpa theoretically goes from 0 to 4 but in actuality the lowest gpa in the sample is 1.45. MALE is coded 1 if the student is male, 0 if Female. MALEGPA = MALE * GPA

    [Image: 1.PNG]

    I do not understand why they do not run
    Code:
    reg drink gpa male
    but
    Code:
    reg drink gpa i.male

    And why do they say, about the regression in the picture above, that "The model does not allow for the effects of GPA to differ by gender"?
    Last edited by Phuc Nguyen; 25 Jul 2021, 15:26.

  • #2
    If you are only using -regress-, and the variable male is dichotomous, then it makes no difference whether you use male or i.male in the list of variables.

    However, they may be planning on using the -margins- command later, and in that case failure to use the i. prefix may result in incorrect results. Also if the variable is categorical with more than 2 levels, the use of male in the -regress- command would give incorrect results there as well. Since both of these situations are fairly common, some of us have gotten into the habit of always using the i. prefix for categorical variables in regression commands. It never does any harm, whereas omitting it can lead to errors in the wrong circumstances.
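
    As a rough illustration of both points, here is a sketch that uses Stata's bundled auto dataset rather than the handout's data, so price, mpg, foreign, and rep78 are stand-ins and not the handout's variables. foreign is coded 0/1 and plays the role of male; rep78 has five levels.
    Code:
    sysuse auto, clear
    * foreign is 0/1: both fits give the same coefficients
    regress price mpg foreign
    regress price mpg i.foreign
    * margins can treat foreign as a factor only after the i.foreign fit
    margins foreign
    * rep78 has levels 1-5: without i. it enters as one linear slope across the codes
    regress price mpg rep78
    * with i. each level gets its own shift relative to the base level
    regress price mpg i.rep78
    The first two fits should report the same coefficient on foreign. The last two are different models: the first assumes each one-step increase in rep78 shifts price by the same amount, which is rarely what is intended for a categorical variable.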

    The model given by those commands is drink = constant + b1 * gpa + b2*(male == 1) + error. Clearly, there is no interaction term specified in those commands, and the one and only effect of gpa reflected in that model is the single coefficient of gpa, b1. If they wanted to do a model in which the effect of gpa varies by sex, they would have to include an interaction term:
    Code:
    regress drink i.male##c.gpa
    Now, they do mention a variable MALEGPA = male * gpa--this would be another, inferior, way to represent an interaction. (Inferior, because -margins- would not work with it.) But since they do not actually use it in the regression command, it has no impact on the results. It isn't even clear why they mention it at all.
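
    To make the comparison concrete, here is a sketch on the bundled auto dataset (again a stand-in for the handout's data). formpg is a hypothetical name for a hand-made product term, analogous to MALEGPA.
    Code:
    sysuse auto, clear
    * factor-variable interaction: both main effects plus the product term
    regress price i.foreign##c.mpg
    * slope of mpg separately for each level of foreign
    margins foreign, dydx(mpg)
    * hand-made product term: same fitted model, but Stata cannot tell
    * that formpg depends on foreign and mpg
    generate formpg = foreign * mpg
    regress price foreign mpg formpg
    The two regressions should give the same coefficients. The difference shows up afterwards: following the second fit, -margins- treats formpg as an unrelated covariate, so it cannot vary it along with foreign or mpg.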


    • #3
      Scrolling down further in the linked document, a subsequent regression is run that uses MALEGPA, although the author of the linked document in fact included the interaction using the # factor variable operator, rather than following what I assume was done in the original source that the author adapted.


      • #4
        This is my handout you are discussing. ;-) I suspect some of the notation reflects the fact that this handout began life before factor variable notation existed, and never got 100% cleaned up. Since I last taught the class in 2015 it may never get cleaned up. But hopefully it is still pretty intelligible.

        I wrote the handout because students would say the effect of gender was positive, but when you add an interaction the effect of gender becomes negative, and then they would try to come up with some sort of convoluted explanation involving suppressor effects or whatever. The handout explains that once you add interactions, the interpretation of the main effects changes.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam
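
        A sketch of the point about main effects, using the bundled auto dataset as an assumption in place of the handout's data. In a model y = b0 + b1*x + b2*d + b3*d*x, the gap between the two groups at a given x is b2 + b3*x, so the coefficient b2 on its own is only the gap at x = 0. Here foreign plays the role of male and mpg the role of gpa; mpg = 0 lies outside the observed range, much as gpa = 0 does in the handout's sample.
        Code:
        sysuse auto, clear
        regress price i.foreign##c.mpg
        * the coefficient on 1.foreign is the foreign-domestic gap at mpg == 0
        lincom 1.foreign
        * gap at mpg == 20: main effect plus 20 times the interaction coefficient
        lincom 1.foreign + 20*1.foreign#c.mpg
        * the same two contrasts obtained through margins
        margins, dydx(foreign) at(mpg=(0 20))
        The two -lincom- results can differ in size and even in sign, which is why the coefficient on the group indicator should not be read as "the" effect of group once the interaction is in the model.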


        • #5
          I don’t have Stata handy now, but I notice the handout has commands like

          regress drink gpa male i.male#c.gpa

          That is basically treating male both as a continuous variable and as a categorical variable. I should be using i.male throughout. In fact, I am a little surprised Stata didn’t give me an error or warning when I later ran margins commands. Maybe a newer version would.

          So I will probably break down and clean up that handout a bit!
          -------------------------------------------
          Richard Williams, Notre Dame Dept of Sociology


          • #6
            That is basically treating male both as a continuous variable and as a categorical variable. I should be using i.male throughout. In fact, I am a little surprised Stata didn’t give me an error or warning when I later ran margins commands. Maybe a newer version would.
            In Stata 17.0, margins performs correctly despite the conflict (that is, gives the same answers as when it is run after correctly specifying male as i.male), but marginsplot then throws an error.
            Code:
            . margins male, at(gpa=(0(.5)4))
            
            Predictive margins                                         Number of obs = 218
            Model VCE: OLS
            
            Expression: Linear prediction, predict()
            1._at: gpa =   0
            2._at: gpa =  .5
            3._at: gpa =   1
            4._at: gpa = 1.5
            5._at: gpa =   2
            6._at: gpa = 2.5
            7._at: gpa =   3
            8._at: gpa = 3.5
            9._at: gpa =   4
            
            ------------------------------------------------------------------------------
                         |            Delta-method
                         |     Margin   std. err.      t    P>|t|     [95% conf. interval]
            -------------+----------------------------------------------------------------
                _at#male |
               1#Female  |   28.52206   3.739645     7.63   0.000     21.15081    35.89332
                 1#Male  |   28.67088   3.823221     7.50   0.000     21.13488    36.20687
               2#Female  |   26.51646    3.10792     8.53   0.000      20.3904    32.64252
                 2#Male  |   27.27131   3.141259     8.68   0.000     21.07954    33.46308
               3#Female  |   24.51085     2.4809     9.88   0.000     19.62073    29.40098
                 3#Male  |   25.87174   2.465748    10.49   0.000     21.01147      30.732
               4#Female  |   22.50525   1.863339    12.08   0.000      18.8324     26.1781
                 4#Male  |   24.47217   1.803947    13.57   0.000     20.91639    28.02795
               5#Female  |   20.49965   1.269123    16.15   0.000     17.99806    23.00123
                 5#Male  |    23.0726   1.179173    19.57   0.000     20.74831    25.39688
               6#Female  |   18.49404   .7555032    24.48   0.000     17.00486    19.98322
                 6#Male  |   21.67302   .6989884    31.01   0.000     20.29524    23.05081
               7#Female  |   16.48844   .5936077    27.78   0.000     15.31837     17.6585
                 7#Male  |   20.27345   .7406966    27.37   0.000     18.81346    21.73345
               8#Female  |   14.48283   .9774596    14.82   0.000     12.55615    16.40951
                 8#Male  |   18.87388   1.253232    15.06   0.000     16.40362    21.34414
               9#Female  |   12.47723   1.542711     8.09   0.000     9.436372    15.51808
                 9#Male  |   17.47431   1.885327     9.27   0.000     13.75812     21.1905
            ------------------------------------------------------------------------------
            
            . marginsplot, scheme(sj) noci ytitle(Predicted Drinking Score) name(intgpa)
            invalid at() dimension information;
            using variable male as a factor variable and a regular variable is not supported
            r(322);
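
            One way to avoid the conflict is to specify the grouping variable with factor-variable notation everywhere in the model. The following is a sketch on the bundled auto dataset, an assumption standing in for the handout's data, with foreign in the role of male and mpg in the role of gpa.
            Code:
            sysuse auto, clear
            * foreign enters only as a factor variable: main effects plus interaction
            regress price i.foreign##c.mpg
            * predictions for each group over a grid of mpg values
            margins foreign, at(mpg=(15(5)40))
            * plot the margins just computed, without confidence intervals
            marginsplot, noci
            With the handout's data, the corresponding model would be regress drink i.male##c.gpa, followed by the same -margins- and -marginsplot- commands shown above.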


            • #7
              William Lisowski, good catch. I've tidied up the handout so future generations will hopefully be ok.

              Incidentally, I got the impression Phuc Nguyen was confused by the factor variable notation. It is true that I use it but do not explain it. But the handout is not meant to be stand-alone. There are dozens of handouts both before it and after it! Anybody taking the class with me would presumably understand the handout, but people who just happen to come across one of my handouts in isolation from others may have to do some other reading first.
              -------------------------------------------
              Richard Williams, Notre Dame Dept of Sociology
