  • Using margins with nonlinear predictors in hybrid model

    Hi! I am new to this forum. I hope I am posting in a correct fashion.

    I have been doing panel data analysis recently using Allison’s (2009) hybrid method. However, because I suspect that my outcome is related to one of the predictors in a nonlinear fashion, I am not sure how to model a nonlinear relationship properly under the hybrid method. I have thought of two possible ways to do so.


    Let me begin by introducing the hybrid method proposed by Allison (2009:23-25). The technique is to split each time-varying x into two parts: its deviation from the person-specific mean (which is time-varying) and the person-specific mean itself (which is time-invariant). The goal is that the coefficient estimates for the deviations from person-specific means are equivalent to fixed-effects estimates. In all my simulations, the true model is given by:

    Code:
    generate y = 3 + 3.6*age + 1.5*fincome + .6*fincome^2 + u_i + e


    Thus, fincome is the time-varying variable which has a nonlinear relationship with y.
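
    The post gives only the outcome equation, so the following is a minimal sketch of how a five-wave simulated panel of this kind could be set up. Everything except the `generate y` line is an assumption made for illustration: the number of persons, the seed, the distributions of age, fincome, u_i, and e, and the correlation between fincome and u_i.

    ```stata
    * Hypothetical data-generating sketch: sample size, seed, and the
    * distributions of age, fincome, u_i, and e are assumptions
    clear
    set seed 12345
    set obs 500                                   // 500 persons (assumed)
    generate id   = _n
    generate u_i  = rnormal()                     // person-level effect
    generate age0 = 20 + floor(30*runiform())     // age before wave 1 (assumed)
    expand 5                                      // five waves per person
    bysort id: generate wave = _n
    generate age     = age0 + wave
    generate fincome = 2 + 0.5*u_i + rnormal()    // correlated with u_i (assumed)
    generate e       = rnormal()
    generate y = 3 + 3.6*age + 1.5*fincome + .6*fincome^2 + u_i + e
    xtset id wave
    ```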

    To implement the hybrid method, I ran the following Stata code to transform time-varying x into deviation from person-specific means and person-specific means:

    Code:
    local x fincome
    by id: egen double M`x'=mean(`x')
    label var M`x' "Mean of `x'"
    gen C`x'=`x'-M`x'
    label var C`x' "Mean-centered `x'"


    In my simulation, there are five waves. Thus, each person has five observations. The second line generates the mean of fincome across the five waves for each respondent (the person-specific mean). The fourth line generates the deviation from the person-specific mean for each respondent in each wave.

    After transformation, a linear random-effects model can be estimated. The coefficient estimates for the deviations from person-specific means (in my case, Cfincome) would be identical to those from a fixed-effects model (xtreg with the fe option in Stata). Assuming the relationship between fincome and y is linear, the fixed-effects model would be:

    Code:
    xtreg y c.age c.fincome, fe


    According to Allison (2009), the coefficient estimate for fincome in the fixed-effects model would be the same as the coefficient estimate for Cfincome in the following model (again assuming the relationship between fincome and y is linear):

    Code:
    mixed y c.age c.Mfincome c.Cfincome ||id:
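
    A quick way to check this equivalence on a single simulated dataset is to print the two within estimates side by side. This is only a sketch: it assumes that y, age, fincome, Mfincome, and Cfincome already exist and that the data have been xtset on id. The match is generally expected to be exact only when every time-varying predictor is decomposed in the same way; age is left untransformed here, as in the models above, so small differences may remain.

    ```stata
    * Fixed-effects (within) estimate of fincome
    quietly xtreg y c.age c.fincome, fe
    display "FE estimate for fincome:      " %9.4f _b[fincome]

    * Hybrid model: coefficient on the person-mean-centered fincome
    quietly mixed y c.age c.Mfincome c.Cfincome || id:
    display "Hybrid estimate for Cfincome: " %9.4f _b[Cfincome]
    ```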


    However, Allison (2009) didn’t specify how to transform x when a nonlinear relationship between x and y is suspected. Therefore, even though the above transformation works well for a linear relationship, I don’t know the correct way to model a nonlinear relationship between x and y. That’s the motivation for my simulations.

    Specifically, I have two options for the transformation. The first option is to model the nonlinear relationship as follows:

    Code:
    mixed y c.age c.Mfincome c.Cfincome##c.Cfincome ||id:


    I include the person-specific mean, the deviation from the person-specific mean, and the square of that deviation in the model. If this approach is correct, then the coefficient estimates for Cfincome and c.Cfincome#c.Cfincome should be equal to the true values (1.5 and 0.6 respectively).

    But as shown in the simulation results, the means of the coefficient estimates for Cfincome and c.Cfincome#c.Cfincome are 3.55 and 1.69 respectively. Since the true coefficients are 1.5 and 0.6, it seems that the first way to model the nonlinear relationship is far from ideal.
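
    A possible explanation for this gap can be sketched algebraically, assuming the true model above holds. The notation is introduced here for illustration and is not Allison's: x_it is fincome, M_i is Mfincome, C_it = x_it - M_i is Cfincome, and beta_1 = 1.5, beta_2 = 0.6 are the true coefficients. Substituting x_it = C_it + M_i into the quadratic gives:

    ```latex
    \begin{aligned}
    \beta_1 x_{it} + \beta_2 x_{it}^2
      &= \beta_1 (C_{it} + M_i) + \beta_2 (C_{it} + M_i)^2 \\
      &= \beta_1 C_{it} + \beta_2 C_{it}^2 + 2\beta_2 C_{it} M_i
         + \beta_1 M_i + \beta_2 M_i^2
    \end{aligned}
    ```

    The first option includes C, C^2, and M but leaves out the C x M and M^2 terms, so its coefficients on Cfincome and its square need not equal 1.5 and 0.6.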

    Then, I tested the second option. I treated squared fincome as just another variable and transformed it separately. Specifically, in addition to transforming the linear term of fincome, I also transformed the squared term of fincome:

    Code:
    local x fincome
    by id: egen double M`x'=mean(`x')
    label var M`x' "Mean of `x'"
    gen C`x'=`x'-M`x'
    label var C`x' "Mean-centered `x'"

    gen `x'2=`x'*`x'
    by id: egen double M`x'2=mean(`x'2)
    label var M`x'2 "Mean of `x'2"
    gen C`x'2=`x'2-M`x'2
    label var C`x'2 "Mean-centered `x'2"


    In the second part, I first generated squared fincome. Then I generated the person-specific mean of squared fincome across the five waves for each respondent. Lastly, I generated the deviation from the person-specific mean of squared fincome for each respondent in each wave. After that, I estimated a linear random-effects model in this way:

    Code:
    mixed y c.age c.Mfincome c.Mfincome2 c.Cfincome c.Cfincome2 ||id:


    In this model, I included the person-specific mean of squared fincome (Mfincome2) and the deviation from the person-specific mean of squared fincome (Cfincome2). The results showed that the means of the coefficient estimates for Cfincome and Cfincome2 are 1.50 and 0.60 respectively, which match the true model. Thus I concluded that the second option yields unbiased estimates.

    However, since the deviation from the person-specific mean of squared fincome (Cfincome2) is not simply the square of the deviation from the person-specific mean of fincome (Cfincome), Stata cannot treat the two as linear and squared terms of one variable, and I can’t think of any viable way to use margins. I have been struggling to figure out how to use margins in Stata to show the nonlinear relationship between x and y. Is there any way to run margins under this condition?
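
    The link between the two deviation variables can be written out explicitly. The notation is introduced here for illustration: x_it is fincome, M_i is Mfincome, C_it = x_it - M_i is Cfincome, and an overbar with subscript i denotes a within-person mean. Because the within-person mean of C_it is zero:

    ```latex
    \begin{aligned}
    x_{it}^2 &= C_{it}^2 + 2 M_i C_{it} + M_i^2 \\
    \overline{x^2}_i &= \overline{C^2}_i + M_i^2 \\
    x_{it}^2 - \overline{x^2}_i &= C_{it}^2 - \overline{C^2}_i + 2 M_i C_{it}
    \end{aligned}
    ```

    The left-hand side of the last line is Cfincome2. It depends on Cfincome only in combination with the person mean M_i and the person-specific mean of C^2, which is why a factor-variable term such as c.Cfincome#c.Cfincome does not reproduce it.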


    Reference:

    Allison, Paul David. 2009. Fixed Effects Regression Models. Los Angeles: SAGE.

  • #2
    Originally posted by Pui Yin Cheung
    ...
    Code:
    mixed y c.age c.Mfincome c.Mfincome2 c.Cfincome c.Cfincome2 ||id:


    In this model, I included the person-specific mean of squared fincome (Mfincome2) and the deviation from the person-specific mean of squared fincome (Cfincome2). The results showed that the means of the coefficient estimates for Cfincome and Cfincome2 are 1.50 and 0.60 respectively, which match the true model. Thus I concluded that the second option yields unbiased estimates.

    However, since the deviation from the person-specific mean of squared fincome (Cfincome2) is not simply the square of the deviation from the person-specific mean of fincome (Cfincome), Stata cannot treat the two as linear and squared terms of one variable, and I can’t think of any viable way to use margins. I have been struggling to figure out how to use margins in Stata to show the nonlinear relationship between x and y. Is there any way to run margins under this condition?


    Pui Yin,

    May I ask why you didn't use factor variable syntax in that equation to generate both the linear and squared terms for Mfincome and Cfincome in your last model? You already did so in a prior model, so you know that syntax. However, just on the off chance you didn't know this: when you use factor variable syntax to indicate squared terms, those terms get incorporated in your margins command. Margins will know that when you ask for, say, the marginal effect of a 1-unit change in centered income, it has to calculate it with both the linear and squared terms.
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      Hi Weiwen,
      Based on my simulation, the issue is that factor variable syntax (i.e. c.Cfincome##c.Cfincome) yields biased estimates of the within-effects when I put it into a mixed-effects model. Thus I need to transform the raw variable in both its linear and squared forms. Specifically, the code to transform Cfincome is:
      Code:
      by id: egen double Mfincome=mean(fincome)
      gen Cfincome=fincome-Mfincome
      To transform to Cfincome2, I need to first transform the original variable into its squared form:
      Code:
      gen fincome2=fincome*fincome
      Then, I can transform it as if it were "just another variable":
      Code:
      by id: egen double Mfincome2=mean(fincome2)
      gen Cfincome2=fincome2-Mfincome2
      Based on my simulation, including Mfincome, Mfincome2, Cfincome, and Cfincome2 in the mixed-effects model yields unbiased within-effect estimates. However, the predictors in each pair (i.e. Cfincome and Cfincome2, Mfincome and Mfincome2) are then no longer simple squares of one another: Cfincome2 is not the square of Cfincome, and Mfincome2 is not the square of Mfincome. Hence, factor variable syntax cannot reproduce this model and cannot produce unbiased estimates.

      In sum, the current issue is that if I use factor variable syntax (c.Cfincome##c.Cfincome), the estimates are biased but I can estimate marginal effects easily. On the other hand, if I use the correct way of modeling (I call it the "just-another-variable" approach), I cannot use margins. I hope this clarifies the issue. Thanks.
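
      One possible workaround, offered as an untested sketch rather than a confirmed solution: holding the person means fixed, the fitted within-person part of the "just-another-variable" model is b[Cfincome]*fincome + b[Cfincome2]*fincome^2 up to person-level constants, so its slope at a raw fincome value v is b[Cfincome] + 2*v*b[Cfincome2]. That linear combination can be evaluated with lincom after estimation. The evaluation points 1, 2, and 3 below are illustrative assumptions and should be replaced by values within the observed range of fincome.

      ```stata
      mixed y c.age c.Mfincome c.Mfincome2 c.Cfincome c.Cfincome2 || id:

      * Within-person slope of fincome, b[Cfincome] + 2*v*b[Cfincome2],
      * at illustrative raw values v (assumed; adjust to the data range)
      foreach v of numlist 1 2 3 {
          local k = 2*`v'
          display as text _newline "Within-person slope at fincome = `v'"
          lincom Cfincome + `k'*Cfincome2
      }
      ```

      Each lincom call reports the slope with its standard error and confidence interval, which is the information margins, dydx() would otherwise provide for a factor-variable quadratic.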
