  • Weighting regressions

    Hi,

    My analysis involves forming skill groups from time-series data and running a panel regression on these skill groups over time. I calculate the average wage in each skill group. So far, I've weighted the wage manually by multiplying it by the group's weight in that year divided by the total weight. I'm not sure whether I should instead weight the whole regression; I've read about aweights. However, I also have a variable that is a percentage of the data, i.e. the percentage of immigrants in each skill group, so I'm unsure whether I can still use aweights. Also, I've calculated the weights from the sample in order to compute the mean wage, but I later take the log of the wage, so I'm not sure how I would weight that using pweights.

    The regression I've tried looks like this:
    regress logwage0 Immigshock1 i.c_id i.Education i.Experience i.Year [aweight=Weight], cluster(Group)

    logwage0 is the log of the unweighted wage, as opposed to logwage1, which I've used so far and which is manually weighted. Immigshock1 is the percentage of immigrants in each skill group in the original dataset.

    Please could you help with this? Thank you. A sketch of the setup I have in mind follows below.
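
    For reference, a minimal sketch of that setup, assuming the individual-level data hold a wage variable, the weight, and the grouping variables used in the command above; wage, meanwage, logmeanwage, and cellweight are hypothetical names, the rest are taken from the command:

    * Sketch only: build weighted mean wages at the skill-group x year level,
    * take the log after averaging, and carry each cell's total weight into
    * the group-level regression as an analytic weight.
    preserve
    collapse (mean) meanwage = wage Immigshock1 ///
        (rawsum) cellweight = Weight [aweight=Weight], ///
        by(c_id Education Experience Year Group)
    gen logmeanwage = ln(meanwage)
    regress logmeanwage Immigshock1 i.c_id i.Education i.Experience i.Year ///
        [aweight=cellweight], cluster(Group)
    restore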

  • #2
    Hi, additionally, I believe my previous regression was over-specified, as I kept getting very large coefficients and a high R-squared. I have 142 observations: 6 years and 27 skill groups. I have dummy variables for education and experience, which are categorical variables, as well as dummy variables for time. The paper I'm following later also adds interactions between these dummy variables. Is there a minimum number of observations I need to avoid over-specification? I'm not sure whether increasing the number of years would help. Also, when I've run the regression before, the education and experience variables keep getting dropped due to multicollinearity. The model I'm following is common in the literature, and I don't understand the reason for this:


    note: 2.Education omitted because of collinearity
    note: 3.Education omitted because of collinearity
    note: 2.Experience omitted because of collinearity
    note: 3.Experience omitted because of collinearity
    note: 4.Experience omitted because of collinearity
    note: 5.Experience omitted because of collinearity
    note: 6.Experience omitted because of collinearity
    note: 7.Experience omitted because of collinearity
    note: 8.Experience omitted because of collinearity
    note: 9.Experience omitted because of collinearity

    Please could you help me with this? A sketch of a check I could run is below.
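
    One way to see what is happening, sketched with the same (assumed) variable names as in #1: if every value of c_id corresponds to a single Education x Experience cell, then i.Education and i.Experience are exact linear combinations of i.c_id and Stata has to drop them; the number of estimated coefficients can also be compared with the 142 observations.

    * Are the skill-group dummies already exact combinations of Education and
    * Experience? A one-to-one mapping in these tables would explain the notes.
    tabulate c_id Education
    tabulate c_id Experience

    * How many coefficients are actually estimated (constant included)?
    regress logwage0 Immigshock1 i.c_id i.Education i.Experience i.Year ///
        [aweight=Weight], cluster(Group)
    display e(rank)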

    • #3
      Arjun:
      you may want to take a look at Portnoy's suggested ratio: https://projecteuclid.org/euclid.aos/1176346793 (thanks once more to Joao Santos Silva for mentioning this article some time ago).
      Kind regards,
      Carlo
      (Stata 19.0)

      • #4
        Hi Carlo,

        Thank you for the help. Could you briefly explain what it is saying, if possible? I've read through it and am struggling to understand it. I'm also a bit stressed, so my mind isn't working as well as usual. Thank you.

        • #5
          Arjun:
          in brief, Portnoy's suggested ratio (https://projecteuclid.org/euclid.aos/1176346793) states that [(constant + predictors)^2]/sample size should ideally go to 0 as the sample size goes to infinity; in practice, the smaller that ratio, the safer the number of predictors is relative to your sample. A numerical illustration follows below.
          Kind regards,
          Carlo
          (Stata 19.0)
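
          As a purely hypothetical illustration (the parameter count below is assumed, not taken from the thread): with the 142 observations mentioned in #2 and, say, 40 estimated coefficients including the constant, the ratio is nowhere near 0.

          * hypothetical: 40 parameters (constant included), n = 142
          display 40^2 / 142    // about 11.27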
