  • Svy Command and incorrect coefficient

    Hi,
    I am using two waves of the DHS survey and I need to run a weighted regression of education on age and a dummy variable z. I have set up the weights as follows:

    Code:
    egen N=total(v005)
    egen n=total(v005), by(s_year)
    * In this case I had 2 surveys
    gen wtr=v005*(N/2)/n
    egen psu=group(s_year v021)
    egen stratar=group(s_year v022)
    svyset psu [pw=wtr], strata(stratar) singleunit(centered)
    After that, when I run the regression, I get an implausible coefficient on the dummy variable. The educ variable ranges from 0 to 20, so the coefficient I get on the dummy cannot be right. Any help resolving this issue would be much appreciated.
    Code:
    svy: regress educ z yob z_yob
    where educ is years of education, z is the treatment dummy, yob is year of birth, and z_yob is the interaction of z and yob. The coefficients that I am getting are:

    Beta_z = -264.59: both the sign and the magnitude are implausible; the correct coefficient should be around 1.34.
    Beta_yob = 0.05, which is correct (replication)
    Beta_z_yob = 0.133, which is also correct (replication)
    constant = incorrect, but close to the correct value.

  • #2
    There is nothing implausible about that value of Beta_z. You are probably misinterpreting it. Because you have an interaction model, the coefficient Beta_z is not the marginal effect of z on educ. Rather, it is the marginal effect of z on educ for those people with yob = 0. Since there is probably nobody in your data set born in year 0 (if there is, I would love to hear how you acquired such data from antiquity) this is a meaningless result. The actual marginal effect for a person whose year of birth is Y would be -264.59 + 0.133*Y. If you replace Y by the average year of birth in your data, you will probably come out with something close to the 1.34 you were expecting. For example, the marginal effect of z on educ for a person born in year 2000 is -264.59 + 0.133*2000 = 1.41.
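
    The same arithmetic can be left to Stata after the original regression, which also gives a standard error and confidence interval for the combined effect. A minimal sketch, assuming the model is fit as in #1 with the hand-made interaction z_yob; the birth years 1980 and 2000 are illustrative values only, not taken from the data:

    ```stata
    svy: regress educ z yob z_yob

    * effect of z for a person born in 2000: _b[z] + 2000*_b[z_yob]
    lincom z + 2000*z_yob

    * same calculation for an illustrative birth year of 1980
    lincom z + 1980*z_yob
    ```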

    Another way to work with this data would be to replace the yob variable with a variable that is centered at the mean:

    Code:
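    * center yob at its (unweighted) sample mean
    summarize yob
    gen yob_c = yob - r(mean)

    * i.z##c.yob_c includes z, yob_c, and their interaction
    svy: regress educ i.z##c.yob_c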
    That way the coefficient of z (shown as 1.z in the output) will represent the marginal effect of z on educ for a person born in the average year, which will be far more meaningful.
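
    If the effect of z at several birth years is of interest, margins can report it without any centering by hand. A sketch, assuming yob is kept in its original units and the interaction is specified with factor-variable notation; the range 1960 to 2000 is a placeholder to be replaced with years that actually occur in the data:

    ```stata
    svy: regress educ i.z##c.yob

    * marginal effect of z on educ at each listed year of birth
    margins, dydx(z) at(yob=(1960(10)2000))
    ```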

    To better understand interaction models, how they work, and how to interpret them, I recommend Richard Williams' excellent notes at https://www3.nd.edu/~rwilliam/stats2/l53.pdf. While you're at his website, I recommend you browse around; he has lots of really well-written, easily understandable class notes there on a variety of common statistical topics.
    Last edited by Clyde Schechter; 03 Nov 2021, 22:56.


  • #3
    Thank you everyone. This problem is resolved. I had interacted the treatment dummy with year of birth, when the interaction should instead use the running variable: the difference, in years, between year of birth and the cutoff year.
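
    For readers with the same problem, the fix described here might look like the following. This is only a sketch: the cutoff year 1975 and the variable name run are hypothetical, since the thread does not state the actual cutoff.

    ```stata
    * hypothetical cutoff year; replace with the one from your design
    local cutoff = 1975

    * running variable: years between year of birth and the cutoff
    gen run = yob - `cutoff'

    svy: regress educ i.z##c.run
    ```

    With this coding, the coefficient on 1.z is the effect of z at the cutoff (run = 0) rather than at yob = 0.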
