
  • Main effects and/versus Interactions

    Hi all,

    I have four vars:
    outcome coded as 0 "no" 1 "yes"
    race coded as 1 "asian" 2 "latin" 3 "black" 4 "white"
    region_res coded as 1 "South" 2 "West" 3 "Midwest" 4 "Northeast"
    rural coded as 0 "no" 1 "yes"
    The dataset is complex survey data; thus, I svyset it and ran logistic regressions with race-region interactions as follows.

    Code:
    svy: logit outcome ib3.race##ib3.region_res i.rural, or
    Linearized
    outcome         Odds Ratio   P>t
    race
      1                   0.50
      2                   0.24   ***
      4                   0.34   ***
    region_res
      1                   0.30   *
      2                   1.67   *
      4                   0.43   ***
    race#region_res
      1 1                 4.90
      1 2                 0.22
      1 4                 0.73
      2 1                10.26   **
      2 2                 2.43
      2 4                 2.15   *
      4 1                 4.66
      4 2                 1.58
      4 4                 5.00   ***
    1.rural               0.06   ***
    _cons                14.52   ***

    However, I then thought that region of residence and racial background are discrete characteristics of each observation, so I could not see the point of including the main effects of region and race in the regression. So I excluded the main effects and got the following output.

    Code:
    svy: logit outcome ib3.race#ib3.region_res i.rural, or
    Linearized
    outcome         Odds Ratio   P>t
    race#region_res
      1 1                 0.74
      1 2                 0.19   ***
      1 3                 0.50
      1 4                 0.16   ***
      2 1                 0.74
      2 2                 0.96
      2 3                 0.24   ***
      2 4                 0.22   ***
      3 1                 0.30   *
      3 2                 1.67
      3 4                 0.43   ***
      4 1                 0.49
      4 2                 0.91
      4 3                 0.34   ***
      4 4                 0.73
    1.rural               0.06   ***
    _cons                14.52   ***
    MY QUESTION:
    Am I right to use the second, 'partial factorial' specification? I'm wondering whether dropping the main effects affects the analysis.

    Thank you!

    I use Stata/SE 15.1.
    Last edited by Messu Melu; 27 Apr 2019, 16:34.

  • #2
    You can do this either way. You just have to be sure you know how to interpret the results of whichever way you choose. I would say that the best way to interpret these results is to use the -margins- command, as it will give you the right results either way and you don't have to worry that you've done it wrong.
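
    The outputs in #1 are consistent with this: the odds ratio for rural and the constant are identical in both models, as one would expect if the two specifications are the same model parameterized differently. As a rough check that uses only the rounded odds ratios posted in #1, the product of the two main effects and the interaction from the ## model should reproduce the corresponding cell of the # model. The -lincom- line is a sketch that assumes the ## model is the most recent estimation in memory.

    ```stata
    * Race 1 in region 1 versus the reference cell (race 3, region 3),
    * rebuilt from the rounded ## output: main effect * main effect * interaction
    display 0.50 * 0.30 * 4.90
    * Race 4 in region 4 versus the same reference cell
    display 0.34 * 0.43 * 5.00
    * The same contrast taken from the fitted ## model, with a confidence interval
    lincom 1.race + 1.region_res + 1.race#1.region_res, or
    ```

    The two products come to roughly 0.74 and 0.73, in line with the 1 1 and 4 4 rows of the # output, up to rounding of the posted values.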



    • #3
      Thank you very much, Clyde, for your fast response! This is very helpful! Sure, I will use margins to interpret. However, I could not find any examples in the literature of interpreting two interacting variables with four categories each. Am I right if I just compare each interaction with the reference group, the way I would for two interacting binary variables?
      Last edited by Messu Melu; 27 Apr 2019, 16:52.



      • #4
        Yes. When you have variables with multiple levels and you use the single # interaction operator, it is as if you had a single variable representing all 16 combinations of race and region and then entered that into the model as i.combination_race_region_variable. One category will be omitted as the reference, and each of the other categories gives an odds ratio relative to that omitted category.
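
        A minimal sketch of that equivalence, for illustration only. The variable name combo is hypothetical, and the base level 11 assumes that all 16 race-region cells occur in the data, so that -egen, group()- numbers them 1 to 16 in sort order and the cell for race 3 in region 3 becomes level 11. The -tabulate- line is there to confirm that numbering before fitting.

        ```stata
        * Hypothetical variable: one category per race-region cell
        egen combo = group(race region_res), label
        * Check which level corresponds to race 3, region 3 (assumed to be 11)
        tabulate combo
        * Should give the same odds ratios as ib3.race#ib3.region_res
        svy: logit outcome ib11.combo i.rural, or
        ```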

        If you follow that up with -margins race#region_res- you will get output for all 16 combinations showing their predicted probabilities, which for most people is probably easier to grasp than a series of odds ratios.
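
        For concreteness, a sketch of that sequence using the variable names from #1. The vce(unconditional) variant is an optional addition, not something suggested above; it asks -margins- to treat the covariates as sampled rather than fixed, which is often preferred with survey data.

        ```stata
        * Fit the interaction-only model from #1
        svy: logit outcome ib3.race#ib3.region_res i.rural, or
        * Predicted probability of outcome == 1 for each of the 16 race-region cells
        margins race#region_res
        * Optional variant: standard errors that account for sampling of the covariates
        margins race#region_res, vce(unconditional)
        ```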



        • #5
          Thank you very much once again, dear Clyde!



          • #6
            I found these UCLA annotations helpful as well.

            https://stats.idre.ucla.edu/stata/fa...n-interaction/
