
  • Estimating a beta coefficient only for a subset of data

    Hi all. I am trying to estimate a regression that looks like this:

    Code:
    use "https://www.stata-press.com/data/r17/nlswork.dta"
    reg ln_wage tenure hours wks_work ttl_exp
    Let's say that I want to estimate the beta coefficients for each right-hand-side variable on the entire sample, except for tenure, for which I want Stata to compute the coefficient only on the subset of people who are married, that is, msp==1 in this dataset. In other words, I want the coefficients for hours, wks_work, and ttl_exp to be estimated on the entire sample, while the coefficient for tenure is estimated only on the subsample of married people.

    How do I go about it? I do not know what to do. Simply using c.tenure#i.msp does not work, of course.

    Any help is appreciated. Thank you!

  • #2
    It is not clear what you want, but you can use
    Code:
    reg ln_wage tenure hours wks_work ttl_exp if msp==1



    • #3
      Hi Tiago. There are about 28,000 observations in the sample, 17,000 of which are for married people. If I run your line of code, the entire regression is estimated only on the 17,000 married observations, but that is not what I want. I want the beta coefficients for hours, wks_work, and ttl_exp to be estimated on all 28,000 observations in the sample, while the beta coefficient for tenure is estimated only on the 17,000 married ones.

      Basically, I need my regression to look something like this:

      ln_wage = beta1 * tenure (if msp==1) + beta2 * hours + beta3 * wks_work + beta4 * ttl_exp

      I hope this is clear, sorry if not.
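
      One way to write the line above as a single equation is with an indicator for marital status. This is only a sketch of one possible reading: the intercept \beta_0 and the error term \varepsilon_i are assumptions (they are not in the line above, although reg includes a constant by default), and i indexes observations.

      ```latex
      \ln(\text{wage}_i) = \beta_0
        + \beta_1 \,\bigl(\text{tenure}_i \times \mathbf{1}[\text{msp}_i = 1]\bigr)
        + \beta_2 \,\text{hours}_i
        + \beta_3 \,\text{wks\_work}_i
        + \beta_4 \,\text{ttl\_exp}_i
        + \varepsilon_i
      ```

      Under this reading the tenure term is zero whenever msp is 0, so the tenure slope for non-married observations is fixed at zero rather than left unestimated.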



      • #4
        I see. So this is not a Stata-related issue.
        Last edited by Tiago Pereira; 13 Feb 2022, 09:24.



        • #5
          Federico:
          you can't, as Stata's internal computations are based on matrices, which cannot be shortened for a subset of the variables only.
          Therefore:
          1) you can run:
          Code:
          use "https://www.stata-press.com/data/r17/nlswork.dta"
          reg ln_wage tenure hours wks_work ttl_exp
          which uses all the observations in the dataset (except those with missing values in any variable, which Stata omits by default via casewise deletion);
          2) the other approach is to restrict your regression to a given subsample of the original dataset via an -if- qualifier:
          Code:
          use "https://www.stata-press.com/data/r17/nlswork.dta"
          reg ln_wage tenure hours wks_work ttl_exp if msp==1
          (the previous comment about casewise deletion still holds).
          Kind regards,
          Carlo
          (Stata 19.0)
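
          For comparison with options 1) and 2), the specification written out in #3 can also be read as a regression on tenure multiplied by a married indicator. What follows is a minimal sketch under that assumption, not something proposed in this thread; tenure_msp is a hypothetical variable name.

          ```stata
          * Sketch only: tenure_msp is a hypothetical name, not from the thread.
          use "https://www.stata-press.com/data/r17/nlswork.dta", clear

          * tenure for married observations (msp == 1), 0 for msp == 0;
          * left missing where msp is missing so those rows are dropped.
          generate tenure_msp = tenure * (msp == 1) if !missing(msp)

          * All complete observations enter the regression; the tenure slope
          * is constrained to zero for the non-married observations.
          regress ln_wage tenure_msp hours wks_work ttl_exp
          ```

          Note that this fits one model with a common intercept, so the coefficient on tenure_msp can also absorb a level difference between the two groups. Adding i.msp as a regressor would let the intercept differ by marital status; whether that is appropriate depends on the research question.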
