  • xtmixed using survey data

    Hi all, I am currently working with a large survey dataset that contains sampling weights. The data are at the individual student level, and each student has achievement data for five time points. I need to run a longitudinal analysis, so I first reshaped the data into long format. My adviser used a log-transformed time variable as one of the independent variables in the following syntax:

    xtmixed READ LNAGE5 EASIA_CHILD_BH SEASIA_CHILD_BH SASIA_CHILD_BH USASIA_CHILD_BH BLACK HISP OTHER EASIAxLNAGE5 SEASIAxLNAGE5 SASIAxLNAGE5 ///
        USASIAxLNAGE5 BLACKxLNAGE5 HISPxLNAGE5 OTHERxLNAGE5 || SCHID:, || CHILDID: LNAGE5, mle cov(un) var pweight(C1_7FC0)

    READ is the outcome variable, and LNAGE5 is the log-transformed time variable (the age of the student at the time of the test). I am not sure why the log transformation is needed, but when I tried running the same syntax with the original time variable, Stata gave me error messages. Does anyone know why? The remaining independent variables are student race and the interactions between race and time.

    My second question concerns predicted values. Once the model has run, I have the output with coefficients for each independent variable along with the residuals. My adviser then calculated all the predicted values by hand, but he wants me to find an easier way to get the predicted value at each time point. He mentioned -margins-, but I am having a hard time getting the results I want. Basically, I want the predicted value for each racial group (East Asian, South Asian, etc.) at each time point. Oddly, the time points he listed are not integers but log-transformed values, which seems to make it impossible for -margins- to recognize them. Any clues, please?

    Thank you!

  • #2
    Hello Man,

    Welcome to the Stata Forum,

    As recommended in the FAQ, please present commands and output within CODE delimiters. Also, regarding "Stata gives error messages", please show exactly what you typed and the message Stata returned. There is also something I fail to understand: you said Stata gave you error messages, but it is not clear what you changed afterwards, since you also had "the output with coefficients for each of the independent variables along with the residuals".

    That said, I wish to make two comments.

    Instead of xtmixed, the updated command is now mixed.

    If you wish to fit a multilevel model to survey data, and you may need weights at different levels, consider taking a look at the svy: meglm command.

    Hopefully that helps.
    Last edited by Marcos Almeida; 21 Jan 2017, 03:09.
    Best regards,

    Marcos


    • #3
      Hi Marcos, thanks, and I am sorry that I didn't make my question clear enough. The code actually works with xtmixed (including the sampling weights), and I have no problem getting the coefficients for each of the independent variables I need. The command is as follows:

      Code:
      xtmixed READ LNAGE5 EASIA_CHILD_BH SEASIA_CHILD_BH SASIA_CHILD_BH USASIA_CHILD_BH BLACK HISP OTHER EASIAxLNAGE5 SEASIAxLNAGE5 SASIAxLNAGE5 USASIAxLNAGE5 BLACKxLNAGE5 HISPxLNAGE5 OTHERxLNAGE5 || SCHID:, || CHILDID: LNAGE5, mle cov(un) var pweight(C1_7FC0)
      As you might notice, it's a three-level model: repeated measurements nested in children, and children nested in schools.

      The output is as follows:
      Code:
      Obtaining starting values by EM: 
      
      Performing gradient-based optimization: 
      
      Iteration 0:   log pseudolikelihood =   -4839797  
      Iteration 1:   log pseudolikelihood = -4689532.6  
      Iteration 2:   log pseudolikelihood = -4672405.2  
      Iteration 3:   log pseudolikelihood = -4669992.1  
      Iteration 4:   log pseudolikelihood =   -4669990  
      Iteration 5:   log pseudolikelihood =   -4669990  
      
      Computing standard errors:
      
      Mixed-effects regression                        Number of obs      =     31641
      
      
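      -----------------------------------------------------------
                      |   No. of       Observations per Group
       Group Variable |   Groups    Minimum    Average    Maximum
      ----------------+------------------------------------------
                SCHID |     3579          1        8.8         96
              CHILDID |    12348          1        2.6          6
      -----------------------------------------------------------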
      
      
      Wald chi2(15)      =  20786.19
      Log pseudolikelihood =   -4669990               Prob > chi2        =    0.0000
      
      (Std. Err. adjusted for 3579 clusters in SCHID)
      
      ----------------------------------------------------------------------------------
                       |                Robust
                  READ |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -----------------+----------------------------------------------------------------
                LNAGE5 |    1.06943   .0096195   111.17   0.000     1.050577    1.088284
        EASIA_CHILD_BH |   .2382712   .1489984     1.60   0.110    -.0537603    .5303028
       SEASIA_CHILD_BH |   .0725828   .0635315     1.14   0.253    -.0519366    .1971021
        SASIA_CHILD_BH |   .3105751    .089994     3.45   0.001     .1341902      .48696
       USASIA_CHILD_BH |   .2321738   .0880707     2.64   0.008     .0595584    .4047893
                 BLACK |  -.0674376   .0374264    -1.80   0.072    -.1407921    .0059169
                  HISP |  -.0970125   .0362391    -2.68   0.007    -.1680399   -.0259851
                 OTHER |  -.0340885    .047777    -0.71   0.476    -.1277298    .0595527
          EASIAxLNAGE5 |  -.1346622   .0640212    -2.10   0.035    -.2601415   -.0091828
         SEASIAxLNAGE5 |  -.0654035   .0353169    -1.85   0.064    -.1346232    .0038163
          SASIAxLNAGE5 |  -.1435941   .0461835    -3.11   0.002    -.2341122    -.053076
         USASIAxLNAGE5 |  -.1594505   .0523861    -3.04   0.002    -.2621253   -.0567757
          BLACKxLNAGE5 |  -.0642827    .018541    -3.47   0.001    -.1006225    -.027943
           HISPxLNAGE5 |  -.0178508   .0190605    -0.94   0.349    -.0552088    .0195071
          OTHERxLNAGE5 |   .0084979   .0250205     0.34   0.734    -.0405414    .0575373
                 _cons |  -.8667657   .0185881   -46.63   0.000    -.9031977   -.8303337
      ----------------------------------------------------------------------------------
      
      
      
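      ------------------------------------------------------------------------------
                                   |               Robust
        Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
      -----------------------------+------------------------------------------------
      SCHID: Identity              |
                        var(_cons) |   .1220529   .0046531      .1132654    .1315222
      -----------------------------+------------------------------------------------
      CHILDID: Unstructured        |
                       var(LNAGE5) |   .0575539   .0026575      .0525739    .0630055
                        var(_cons) |   .2337419   .0090522      .2166566    .2521746
                 cov(LNAGE5,_cons) |   -.112925   .0046371     -.1220136   -.1038365
      -----------------------------+------------------------------------------------
                     var(Residual) |   .0560878   .0016229      .0529955    .0593606
      ------------------------------------------------------------------------------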
      As I mentioned, my first question is why I have to use the log-transformed time variable (LNAGE5) in the command. My second question is about the use of -margins-. I need the predicted value for each racial group, and instead of calculating all the results by hand, I think -margins- can do that in Stata. For example, if I want the predicted value for the East Asian group at time point -0.69, the command I used is as follows:

      Code:
      margins, at(LNAGE5=-0.69 EASIA_CHILD_BH=1) vsquish
      and the results are as follows:

      Code:
      . margins, at(LNAGE5=-0.69 EASIA_CHILD_BH=1) vsquish
      
      Predictive margins                                Number of obs   =    30330
      Model VCE    : Robust
      
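      Expression   : Linear prediction, fixed portion, predict()
      at           : LNAGE5          =        -.69
                     EASIA_CHIL~H    =           1

      ------------------------------------------------------------------------------
                   |            Delta-method
                   |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             _cons |  -1.397645   .1477238    -9.46   0.000    -1.687178   -1.108111
      ------------------------------------------------------------------------------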
      However, the result from the above command does not match the result that my adviser calculated by hand using the growth curve modeling formula. His answer for the East Asian group at time -0.69 is -1.276, whereas mine is -1.3976.
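
      One possible explanation for the gap, offered as a sketch and not a confirmed diagnosis: the interaction terms were built by hand as separate variables, so -margins- has no way of knowing that EASIAxLNAGE5 should move together with LNAGE5 and EASIA_CHILD_BH. Every covariate not named in at() (the other race dummies and all seven product terms) is left at its observed values and averaged over. The code below first reproduces the hand calculation from the coefficients reported above, then asks -margins- for the same quantity by fixing every race dummy and product term explicitly. It assumes the time point is ln(0.5), roughly -0.6931, of which -0.69 would be a rounding; that assumption is not stated anywhere in the thread.

      ```stata
      * Hand calculation for the East Asian group, using coefficients copied
      * from the xtmixed output. ASSUMPTION: the time point is ln(0.5).
      display -.8667657 + 1.06943*ln(0.5) + .2382712 + (-.1346622)*ln(0.5)

      * Ask margins for the same quantity: set the product term to match
      * LNAGE5, and switch every other race dummy and product term off.
      margins, at(LNAGE5=-0.6931 EASIA_CHILD_BH=1 EASIAxLNAGE5=-0.6931     ///
                  SEASIA_CHILD_BH=0 SASIA_CHILD_BH=0 USASIA_CHILD_BH=0     ///
                  BLACK=0 HISP=0 OTHER=0                                   ///
                  SEASIAxLNAGE5=0 SASIAxLNAGE5=0 USASIAxLNAGE5=0           ///
                  BLACKxLNAGE5=0 HISPxLNAGE5=0 OTHERxLNAGE5=0) vsquish
      ```

      The display line evaluates to about -1.276, in line with the adviser's figure. The margins call needs to be run from a do-file, because /// line continuation is not recognized in the Command window. For another group or time point, change which dummy is set to 1 and give LNAGE5 and that group's product term the same value.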
      Last edited by Man Yang; 21 Jan 2017, 12:20.
