  • Using analytical weight in Stata's mixed effects model

    I have a learning assessment dataset covering over 60 countries and 2-5 years (waves) per country. The countries are the same in all years, but the individuals (students) are different in each year; in other words, the data are cross-sectional at the student level. I use a two-step procedure to run a country-level mixed-effects panel regression. In the first step, I regress students' math achievement on their economic background, separately for each country in each year, using simple OLS: achievement = a + b*economic_background + e.
    Or in Stata:

    Code:
    reg achievement economic_background
    The data structure looks roughly like the following, with different students surveyed in different years within the same country:

    Code:
    student country year achievement economic_background
    101 1 2000 500 78
    201 1 2000 488 98
    106 1 2003 589 66
    407 1 2003 400 76
    Then, in the second stage, I use the coefficient on economic_background (named inequality_gradient) as the dependent variable and regress it on some country-level variables. I do this with a mixed-effects model using Stata's mixed command. The model looks like the following:

    Code:
    mixed inequality_gradient var2 var3 || country:
    However, to get unbiased standard errors from the second-stage mixed-effects model, I would like to weight the model by the inverse of the squared standard error of the economic_background coefficient from the first-stage OLS regression. To apply this weight, named gradient_se, I tried to use Stata's analytic weight (aweight) option, but the mixed command does not seem to accept aweights. Does anybody have a suggestion for how to incorporate these analytic weights in mixed some other way?
    I tried the following code but got an error:

    Code:
    mixed inequality_gradient var2 var3 [aw=gradient_se] || country:
    aweights not allowed
    r(101);
    I have also tried pweight, but since I only have weights at level 1, I get a warning that the results may be biased. I do not have weights for countries at level 2. Is there some other way to incorporate weights at level 1 only in a mixed model?
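
    For concreteness, the level-1 pweight attempt described above might look like the following sketch. The variable name inv_var_w is hypothetical, and the sketch assumes gradient_se holds the first-stage standard error rather than a ready-made weight. pweights are intended as sampling weights, so this is not necessarily equivalent to the precision weighting that aweight would give.

    Code:
    * Assumption: gradient_se is the standard error of the first-stage slope
    * inv_var_w is a hypothetical name for the inverse-variance weight
    generate double inv_var_w = 1/gradient_se^2
    * Weight supplied at level 1 only; no country-level weight is given
    mixed inequality_gradient var2 var3 [pw=inv_var_w] || country: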

    The data structure at the second stage looks like the following:

    Code:
    country inequality_gradient gradient_se year var2 var3
    1 300 44 2000 1 3
    1 200 34 2000 1 3
    2 498 55 2003 2 2
    2 388 67 2003 4 1
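
    A first stage that produces a dataset of this shape could be sketched with statsby, run on the student-level data. This is only an illustration under the assumption that one regression is fitted per country-year pair; the stored names follow the extract above, and the country-level variables (var2, var3) would still need to be merged back in afterwards.

    Code:
    * One row per country-year: slope on economic_background and its standard error
    statsby inequality_gradient=_b[economic_background] ///
            gradient_se=_se[economic_background], ///
            by(country year) clear: ///
            regress achievement economic_background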
    Please let me know if I need to make my problem clearer. I would be happy to do so.

  • #2
    I'm very confused. You say that inequality_gradient is calculated by regressing achievement on economic background in each country and taking the resulting coefficient. That implies that inequality_gradient must be constant within country. But in your data extract at the end of your post, it clearly varies within country. So there is some disconnect between what you are saying and what you are doing.

    I should add that if you did have inequality_gradient constant within country, I think you would be unable to even run your second stage model--I strongly doubt it would converge, because the within-country (residual) variance component would be identically zero.

    Please clarify.
    Last edited by Clyde Schechter; 29 Jul 2020, 12:56.

    • #3
      Hi Clyde,

      Thank you for pointing this out. Sorry, it was a mistake in the extract. The years should differ within each country: it should have been the years 2000 and 2003 for country 1 and the years 2000 and 2003 for country 2.

      • #4
        So, you don't say it in so many words, but you imply that you actually calculate the inequality_gradient separately for each country#year combination. Is that right?

        Assuming it is, there is nothing exactly analogous to -aweight- that you can use in -mixed-. However, -mixed- does support heteroscedastic residual estimation. Since you are calculating your dependent variable at the country#year level, you could do this for your second stage:

        Code:
        egen cy = group(country year)
        mixed inequality_gradient var2 var3 || country:, residuals(, by(cy))
        Now, if the number of country-year pairs is large, this is going to be hugely computationally intensive, and it may be difficult or impossible to obtain convergence. But that's the best I can think of to do.

        It's possible somebody else can think of a better way to do this, and if so, I hope he or she will chime in.

        • #5
          Hi Clyde,
          Thanks for your helpful reply. Yes, I calculate inequality_gradient separately for each country#year combination: approximately 260 country-year pairs for 69 countries.
          You are right: the code takes a long time and does not finish running.
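
          One lighter variant of the approach in #4 might be to estimate a separate residual variance for a handful of groups with similar first-stage precision, rather than one per country-year pair. This is a rough sketch and only an approximation to full inverse-variance weighting; the name se_bin and the choice of 5 bins are assumptions made for illustration.

          Code:
          * Group country-years into 5 bins of similar first-stage standard error
          * (se_bin and the number of bins are hypothetical choices)
          xtile se_bin = gradient_se, nquantiles(5)
          * One residual variance per bin instead of one per country-year pair
          mixed inequality_gradient var2 var3 || country:, residuals(independent, by(se_bin))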

          In case someone has relevant experience, I also want to note that I have tried runmlwin (which runs MLwiN from within Stata for multilevel modelling). runmlwin accepts weights and gives no warning about the results being biased, but I am not sure whether runmlwin's weighting is appropriate for my case. One notable aspect is that the weighted results do not differ considerably from the unweighted regression.

          The code is the following:

          Code:
          runmlwin cons inequality_gradient var2 var3, level2(country: cons) level1(year: cons, weightvar(gradient_se)) fpsandwich rpsandwich nopause

          • #6
            You were advised on Stack Overflow https://stackoverflow.com/questions/...-effects-model to post here and that was good advice, but our own cross-posting policy is explained in the FAQ Advice: you are asked to tell us about it.

            • #7
              Update: the post on Stack Overflow has now been deleted, so is not generally visible.
