xtmixed using survey data

Man Yang

Join Date: Mar 2016

Posts: 183
#1

20 Jan 2017, 18:12

Hi all, I am currently working with a large survey dataset that contains sampling weights. The data are at the individual student level, and each student has achievement data for five time points. I need to run a longitudinal analysis, so I first reshaped the data into long format. My adviser used a log-transformed time variable as one of the independent variables in the following syntax:

xtmixed READ LNAGE5 EASIA_CHILD_BH SEASIA_CHILD_BH SASIA_CHILD_BH USASIA_CHILD_BH BLACK HISP OTHER EASIAxLNAGE5 SEASIAxLNAGE5 SASIAxLNAGE5 ///
    USASIAxLNAGE5 BLACKxLNAGE5 HISPxLNAGE5 OTHERxLNAGE5 || SCHID:, || CHILDID: LNAGE5, mle cov(un) var pweight(C1_7FC0)

READ is the outcome variable, and LNAGE5 is the log-transformed time variable (the age of the students when they took the test). I am not sure why he log-transforms it. I tried running the same syntax with the original time variable, but Stata gave me error messages. Does anyone know why? The rest of the independent variables are student race and the interactions between student race and time.

My second question: once the model has run, I have the output with coefficients for each of the independent variables, along with the residuals. My adviser then hand-calculated all the predicted values, but he wants me to find an easier way to get the predicted value at each time point. He mentioned -margins-, but I am having a hard time getting the results I want. Basically, I want the predicted values for each racial group (East Asian, South Asian, etc.) at each time point. Oddly, the time points he listed are not integers; they are on the log-transformed scale, which -margins- does not seem to recognize. Any clues, please?

Thank you!
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

21 Jan 2017, 03:07

Hello Man,

Welcome to the Stata Forum,

As recommended in the FAQ, please present commands and output within CODE delimiters. Also, regarding "Stata gives error messages", please show exactly what you typed and the message Stata returned. There is also something I fail to understand: you said Stata gave you error messages, but it is not clear what you changed afterwards, since you also had "the output with coefficients for each of the independent variables along with the residuals".

That said, I wish to make two comments.

The xtmixed command has been superseded by mixed.

If you wish to fit a multilevel model to survey data, and considering you may wish to include weights at different levels, you might take a look at the svy: meglm command.
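
To make the first comment concrete, here is a minimal, untested sketch for a do-file. The first command is simply the model from #1 rewritten for mixed, with the variable names as posted. The second assumes a hypothetical categorical variable RACE (one code per group, not a variable shown in the original post) so that the hand-made dummies and products can be replaced by factor-variable notation, which -margins- understands. The values in at() are placeholders, not the actual time points.

```stata
* Model from #1, written for -mixed- (variable names as posted; not run here)
mixed READ LNAGE5 EASIA_CHILD_BH SEASIA_CHILD_BH SASIA_CHILD_BH ///
    USASIA_CHILD_BH BLACK HISP OTHER ///
    EASIAxLNAGE5 SEASIAxLNAGE5 SASIAxLNAGE5 USASIAxLNAGE5 ///
    BLACKxLNAGE5 HISPxLNAGE5 OTHERxLNAGE5 ///
    || SCHID: || CHILDID: LNAGE5, mle covariance(unstructured) pweight(C1_7FC0)

* Alternative with factor variables. RACE is a HYPOTHETICAL categorical
* variable (one code per group) standing in for the race dummies.
mixed READ c.LNAGE5##i.RACE ///
    || SCHID: || CHILDID: LNAGE5, mle covariance(unstructured) pweight(C1_7FC0)

* Predicted values (fixed portion) for each group at chosen values of LNAGE5.
* The values in at() are placeholders, not the actual time points.
margins RACE, at(LNAGE5=(-0.69 0 0.69))
```

With the interaction declared through ##, -margins- moves the product term together with LNAGE5 and RACE, so no interaction variable has to be set by hand.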

Hopefully that helps.

Last edited by Marcos Almeida; 21 Jan 2017, 03:09.

Best regards,

Marcos

Man Yang

Join Date: Mar 2016
Posts: 183

21 Jan 2017, 12:10

Hi Marcos, thanks and I am sorry that I didn't make my question clear enough. So the code actually works using xtmixed (including the sample weights) and I have no problem getting the coefficients for each of the independent variables I need. The command is as follows:

Code:

xtmixed READ LNAGE5 EASIA_CHILD_BH SEASIA_CHILD_BH SASIA_CHILD_BH USASIA_CHILD_BH BLACK HISP OTHER EASIAxLNAGE5 SEASIAxLNAGE5 SASIAxLNAGE5 USASIAxLNAGE5 BLACKxLNAGE5 HISPxLNAGE5 OTHERxLNAGE5 || SCHID:, || CHILDID: LNAGE5, mle cov(un) var pweight(C1_7FC0)

As you might notice, it's a three-level model, with repeated observations nested in children and children nested in schools.

The output is as follows:

Code:

Obtaining starting values by EM: 

Performing gradient-based optimization: 

Iteration 0:   log pseudolikelihood =   -4839797  
Iteration 1:   log pseudolikelihood = -4689532.6  
Iteration 2:   log pseudolikelihood = -4672405.2  
Iteration 3:   log pseudolikelihood = -4669992.1  
Iteration 4:   log pseudolikelihood =   -4669990  
Iteration 5:   log pseudolikelihood =   -4669990  

Computing standard errors:

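Mixed-effects regression                        Number of obs      =     31641

-----------------------------------------------------------
                |   No. of       Observations per Group
 Group Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
          SCHID |     3579          1        8.8         96
        CHILDID |    12348          1        2.6          6
-----------------------------------------------------------

                                                Wald chi2(15)      =  20786.19
Log pseudolikelihood =   -4669990               Prob > chi2        =    0.0000

                                  (Std. Err. adjusted for 3579 clusters in SCHID)
----------------------------------------------------------------------------------
                 |               Robust
            READ |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
          LNAGE5 |    1.06943   .0096195   111.17   0.000     1.050577    1.088284
  EASIA_CHILD_BH |   .2382712   .1489984     1.60   0.110    -.0537603    .5303028
 SEASIA_CHILD_BH |   .0725828   .0635315     1.14   0.253    -.0519366    .1971021
  SASIA_CHILD_BH |   .3105751    .089994     3.45   0.001     .1341902      .48696
 USASIA_CHILD_BH |   .2321738   .0880707     2.64   0.008     .0595584    .4047893
           BLACK |  -.0674376   .0374264    -1.80   0.072    -.1407921    .0059169
            HISP |  -.0970125   .0362391    -2.68   0.007    -.1680399   -.0259851
           OTHER |  -.0340885    .047777    -0.71   0.476    -.1277298    .0595527
    EASIAxLNAGE5 |  -.1346622   .0640212    -2.10   0.035    -.2601415   -.0091828
   SEASIAxLNAGE5 |  -.0654035   .0353169    -1.85   0.064    -.1346232    .0038163
    SASIAxLNAGE5 |  -.1435941   .0461835    -3.11   0.002    -.2341122    -.053076
   USASIAxLNAGE5 |  -.1594505   .0523861    -3.04   0.002    -.2621253   -.0567757
    BLACKxLNAGE5 |  -.0642827    .018541    -3.47   0.001    -.1006225    -.027943
     HISPxLNAGE5 |  -.0178508   .0190605    -0.94   0.349    -.0552088    .0195071
    OTHERxLNAGE5 |   .0084979   .0250205     0.34   0.734    -.0405414    .0575373
           _cons |  -.8667657   .0185881   -46.63   0.000    -.9031977   -.8303337
----------------------------------------------------------------------------------

------------------------------------------------------------------------------
                             |               Robust
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
SCHID: Identity              |
                  var(_cons) |   .1220529   .0046531      .1132654    .1315222
-----------------------------+------------------------------------------------
CHILDID: Unstructured        |
                 var(LNAGE5) |   .0575539   .0026575      .0525739    .0630055
                  var(_cons) |   .2337419   .0090522      .2166566    .2521746
           cov(LNAGE5,_cons) |   -.112925   .0046371     -.1220136   -.1038365
-----------------------------+------------------------------------------------
               var(Residual) |   .0560878   .0016229      .0529955    .0593606
------------------------------------------------------------------------------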

As I mentioned, my first question is why I have to use the log-transformed time variable (LNAGE5) in the command. My second question is about -margins-. I need the predicted value for each racial group, and rather than hand-calculating all the results, I think -margins- can do that in Stata. For example, to get the predicted value for the East Asian group at time point -0.69, I used the following command:

Code:

margins, at(LNAGE5=-0.69 EASIA_CHILD_BH=1) vsquish

and the results are as follows:

Code:

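. margins, at(LNAGE5=-0.69 EASIA_CHILD_BH=1) vsquish

Predictive margins                                Number of obs   =    30330
Model VCE    : Robust

Expression   : Linear prediction, fixed portion, predict()
at           : LNAGE5          =        -.69
               EASIA_CHIL~H    =           1

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |  -1.397645   .1477238    -9.46   0.000    -1.687178   -1.108111
------------------------------------------------------------------------------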

However, the result from the above command does not match the value my adviser hand-calculated using the growth curve model formula. His answer for the East Asian group at time -0.69 is -1.276, whereas mine is -1.3976.
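
One possible source of the gap, offered as a hedged reading of the output above rather than a confirmed diagnosis: EASIAxLNAGE5 is a hand-made product variable, so -margins- has no way of knowing that it should move with LNAGE5 and EASIA_CHILD_BH. With only those two variables set in at(), every other covariate (the other race dummies and all the xLNAGE5 products) stays at its observed values and is averaged over the estimation sample. A hand calculation for an East Asian child would instead set the other race terms to zero and the product to 1 × LNAGE5. Using the fixed-effect coefficients reported above, with t the value of LNAGE5:

```latex
\begin{align*}
\hat{y}(t) &= b_{\text{\_cons}} + b_{\text{LNAGE5}}\,t
              + b_{\text{EASIA\_CHILD\_BH}} + b_{\text{EASIAxLNAGE5}}\,t \\
\hat{y}(-0.69) &= -0.8667657 + 1.06943(-0.69) + 0.2382712 + (-0.1346622)(-0.69) \\
               &= -0.8667657 - 0.7379067 + 0.2382712 + 0.0929169 \approx -1.273 \\
\hat{y}(\ln 0.5) &\approx -0.8667657 - 0.7412722 + 0.2382712 + 0.0933407 \approx -1.276
\end{align*}
```

The last line uses t = ln(0.5) ≈ -0.6931 rather than the rounded -0.69, which reproduces -1.276; that the hand calculation used the unrounded value is an assumption on my part. If this reading is right, also setting EASIAxLNAGE5 to the chosen LNAGE5 value and all other race dummies and products to 0 in at() should bring -margins- into line with the hand calculation.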

Last edited by Man Yang; 21 Jan 2017, 12:20.
