
  • Why does including a variable's square in the regression change the sign of the main variable?

    Dear Stata users,

    I am estimating a simultaneous-equation model with two equations using the cmp command.

    Code:
    cmp (rinlbrf = rshlt_#  age rabplace  i.seperated##i.kid18_  i.married##i.kid18_ lesshighschool college collegemore whitecollar2_ bluecollar wealth midwest northeast west hispanic black rsmoke rdrink|| hhidpn:) (rshlt_ =rinlbrf#  age rabplace married seperated  hispanic black midwest northeast west lesshighschool college collegemore whitecollar2_ bluecollar wealth rsmoke rdrink rconde || hhidpn:) if (sex==1 & age<=65) ,ind($cmp_probit $cmp_oprobit) nolr
    where
    rshlt_ is a categorical health variable and rinlbrf indicates labor market participation.

    The results show that the effect of age on both health and labor participation is negative, as expected. However, when an "age2" variable [age2=age*age] is included in the regression, the age effect becomes positive, which doesn't make sense. Why?

    Thanks,
    Maryam

    Last edited by Maryam Bidgoli; 28 Jul 2016, 11:28.

  • #2
    You have a misunderstanding.

    It does not make sense to think about the effect of changing age alone: you cannot increase age without increasing age2. I would guess that if you calculate the predictions for health and labor participation at two different values of age (with age2 set to age^2) within the range of your data, holding all the variables other than age and age2 constant, you will see a negative effect on both at the larger value of age.
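
    To make this concrete, here is a sketch of the algebra. The symbols b1 and b2 (the coefficients on age and age2) are my notation, not names from your output, and in a probit or ordered probit this is the effect on the underlying linear index rather than on a probability:

    \[
    \frac{\partial}{\partial\,\text{age}}\left(b_1\,\text{age} + b_2\,\text{age}^2\right) = b_1 + 2\,b_2\,\text{age}
    \]

    So b1 by itself is only the slope at age = 0, which is far outside your sample. With purely hypothetical values b1 = 0.05 and b2 = -0.001, the slope is 0.05 - 0.002*age, which is negative at every age above 25.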

    Without being familiar with the cmp command, I will say that in general it is not a good idea to create a separate age2 variable as you suggest you have done. Instead, replace each occurrence of
    Code:
    age age2
    in your cmp command with
    Code:
    c.age##c.age
    See help factor variables for an explanation of that syntax. The cmp command apparently supports the margins postestimation command (with the reduced form estimates), which can help you explore the marginal effects of your independent variables. That works only if the model is specified so that the dependency of age2 on age is visible to Stata; a separately generated second variable hides it.
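
    As a rough sketch of what that looks like in practice, the following uses a plain probit on the nlsw88 example dataset that ships with Stata, not your cmp model or your data; the outcome (union) and the ages in at() are chosen only for illustration. I have not checked how margins behaves after cmp, so treat the cmp help file as the authority there.
    Code:
    * Illustration only: ordinary probit on a shipped dataset, not the cmp model
    sysuse nlsw88, clear
    * c.age##c.age enters age and age squared, and tells Stata they are linked
    probit union c.age##c.age
    * Average marginal effect of age, evaluated at three ages within the sample range
    margins, dydx(age) at(age=(35 40 45))
    Because margins knows that the squared term moves with age, each reported effect accounts for both terms together.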
