Multilevel Regression with plausible values as dependent variable

Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#16

28 Aug 2015, 15:43

Philip: very interesting paper with some disturbing results for PISA rankings fans -- many thanks for the reference. (I recall struggling with the PISA Technical Report about 10 years ago ...)

it is interesting/amusing to think about the (implied) rationale for imputing all values of a variable that was completely missing in the first place...

Yes! Intriguing that "draws from an (estimated) distribution of a latent trait" have now morphed into being treated as "imputed values of a missing variable". [It's the conflation of "missing" and "latent"!] But, as I said, the pv suite does appear to do this. I put such reservations aside in my attempt to provide a "solution" to the problem in this case, which I interpreted as "how to get the data into Stata and recognised as MI?" I guess that that is what the SAS and SPSS code that Vincent cites does as well.
Vincent Puchades

Join Date: Aug 2015

Posts: 6
#17

29 Aug 2015, 12:10

Hi everyone,

Stephen, thank you so much. I ran the commands you mentioned above. It seems to be working this time: at least, I did not get any error message. Below is what I got from Stata after doing what you suggested. I am afraid I got lost in this process (the imputation of the 5 plausible values).

Code:

. gen pv0math=.
(23719 missing values generated)

. mi import wide, imputed(pv0math = pv1math pv2math pv3math pv4math pv5math) clear

. mi describe

. mi varying

* Here, if I understood, I should run the multilevel with just pv0math (but I am not sure about it)?
. mi estimate: xtmixed pv0math gender immigrant wealth home_resources cultpos occupation parents_education attitudes || schoolid:

This issue is very interesting/amusing, as Philip said before, but at the same time it is pretty hard for someone who, like me, is a beginner both in Stata and in the research world. I am not an expert on mi either; I am just trying to understand how to deal with these kinds of problems and to improve my understanding of them.

Philip, the reference you gave us is interesting and I will definitely use it. I did not know about it, nor about how unusual the PISA database is. Thank you so much as well.

Last edited by Vincent Puchades; 29 Aug 2015, 12:33.
Vincent Puchades

Join Date: Aug 2015

Posts: 6
#18

30 Aug 2015, 12:22

Hi,

I would like to add the following information to my previous post.

According to the PISA Data Analysis Manual (http://www.keepeek.com/Digital-Asset...275-7-en#page9):

Population statistics should be estimated using each plausible value separately. The reported population statistic is then the average of each plausible value statistic. For instance, if one is interested in the correlation coefficient between the social index and the reading performance in PISA, then five correlation coefficients should be computed and then averaged. (PISA Data Analysis Manual: SPSS, Second Edition, 2009)
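For reference, the averaging described in this quotation is the point-estimate part of Rubin's combining rules for multiply imputed data. A sketch in formulas, where the notation is an assumption introduced here and not taken from the manual: $\hat{Q}_m$ is the statistic computed with plausible value $m$, $\hat{U}_m$ is its estimated sampling variance, and $M = 5$ for PISA.

```latex
% Rubin's combining rules for M plausible values (notation is illustrative)
\begin{align}
\bar{Q} &= \frac{1}{M}\sum_{m=1}^{M} \hat{Q}_m
  && \text{(reported statistic: average over plausible values)} \\
\bar{U} &= \frac{1}{M}\sum_{m=1}^{M} \hat{U}_m
  && \text{(average within-imputation sampling variance)} \\
B &= \frac{1}{M-1}\sum_{m=1}^{M} \bigl(\hat{Q}_m - \bar{Q}\bigr)^2
  && \text{(between-imputation variance)} \\
T &= \bar{U} + \Bigl(1 + \frac{1}{M}\Bigr) B
  && \text{(total variance; the standard error is } \sqrt{T}\text{)}
\end{align}
```

The standard error $\sqrt{T}$ is what makes test statistics and p-values possible; averaging the five statistics alone gives only the point estimate.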

Any idea how I could do that in Stata? Recall that I already tried the pv command, but it gave me results without p-values. I did not start a new topic for this because we are facing the same problem (multilevel regression with plausible values as dependent variable).

Furthermore, I also found this pretty interesting. The idea comes from the following link: https://www.cmm.bristol.ac.uk/forum/viewtopic.php?t=955

First I artificially make some of the data missing and then create two imputed datasets.

I then fit the model of interest to each imputed dataset and combine the results using Rubin's rules. (Leckie, 2011)

The idea would be to register the plausible values as imputed values, and then use the mi commands, by doing this:

Code:

. replace pv1math=. if runiform()<=0.0001
(3 real changes made, 3 to missing)

. replace pv2math=. if runiform()<=0.0001
(5 real changes made, 5 to missing)

. replace pv3math=. if runiform()<=0.0001
(2 real changes made, 2 to missing)

. replace pv4math=. if runiform()<=0.0001
(2 real changes made, 2 to missing)

. replace pv5math=. if runiform()<=0.0002
(2 real changes made, 2 to missing)

. mi set mlong

. mi register imputed pv1math pv2math pv3math pv4math pv5math

What do you think about this idea? Or is there no need for it? I would really like to know what you think.

Thank you so much
Stephen Jenkins

Join Date: Apr 2014

Posts: 1435
#19

30 Aug 2015, 16:25

I don't understand why you're asking these sorts of questions again. It appeared from your post in #17 that you had got your data recognised by Stata's mi suite, and so you could then use commands like mi estimate: regress etc. The latter procedure does exactly what the quotation from the Bristol website suggests, i.e.:

... fit the model of interest to each imputed dataset and combine the results using Rubin's rules. (Leckie, 2011)

It's also what the PISA Data Analysis Manual that you cite suggests.
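To make the equivalence concrete, here is a minimal sketch of the pooling done by hand for a single coefficient. It assumes the variable names from post #17 (pv1math to pv5math, gender, wealth, schoolid) are in memory as ordinary variables; the shortened model is purely illustrative, and the sketch ignores the PISA weights and the degrees-of-freedom calculation that mi estimate also performs.

```stata
* Hand-rolled Rubin's rules for the coefficient on gender (illustrative only).
* Assumes pv1math-pv5math, gender, wealth and schoolid exist in the dataset.
local m = 5
local sum_b = 0
local sum_v = 0
forvalues i = 1/`m' {
    * fit the same model once per plausible value
    quietly xtmixed pv`i'math gender wealth || schoolid:
    local b`i' = _b[gender]
    local sum_b = `sum_b' + `b`i''
    local sum_v = `sum_v' + _se[gender]^2
}
local qbar = `sum_b'/`m'        // pooled point estimate (simple average)
local W = `sum_v'/`m'           // average within-imputation variance
local B = 0
forvalues i = 1/`m' {
    * between-imputation variance of the five estimates
    local B = `B' + (`b`i'' - `qbar')^2/(`m' - 1)
}
local T = `W' + (1 + 1/`m')*`B' // total variance
display "pooled coefficient = " `qbar' "   pooled SE = " sqrt(`T')
```

With the data imported as in #17, the coefficient and standard error reported by mi estimate: xtmixed pv0math gender wealth || schoolid: should agree with these two numbers; mi estimate additionally supplies the test statistics and p-values.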

Sorry, but I can't offer you more advice on this topic. Not only have I exhausted my knowledge but I'm going to be without internet access for a while. My farewell advice is: do some background reading on multiple imputation and how it works before returning to your data. [There are good general references around (with citations in Stata's reference manual).] I conjecture that you wouldn't be sending out additional posts like #18 if you had done such reading.

I'd appreciate it if you followed up my early requests to you to read the FAQ and, for instance, to re-register to use your full name (firstname lastname), and to use CODE delimiters for all output that you post to the Forum. Thank you.
Vicent Puchades

Join Date: Aug 2015

Posts: 1
#20

31 Aug 2015, 02:55

Hi everybody,

With respect to post #18, I thought that this was the appropriate place to discuss the statistical and other issues raised by using Stata. As I found that link interesting, I wanted to share it here as an alternative way of dealing with plausible values as dependent variables, and to ask for your technical opinion or suggestions about it, since I consider plausible values and PISA to be areas for future additions or improvements to Stata, in comparison with other statistical packages.

As you said, the output in post #17 shows that the data are recognised by Stata's mi suite. So thank you so much again; as a student, I really appreciate that someone like you takes the time to answer our questions.

Last edited by Vicent Puchades; 31 Aug 2015, 03:01.
Maleika Krüger

Join Date: Oct 2014

Posts: 19
#21

01 Sep 2015, 03:55

Hi everyone,

Wow, this thread has grown since I last had time to check it.

Since I was asked directly about what I did to solve the problem, I wanted to give a short answer:
I actually didn't. Since I couldn't find a real solution, and the analysis was more of an exercise to familiarise myself with the mi command and the process of imputation in Stata in general, I decided to disregard the multilevel structure and the plausible values for now and do some easier analysis. So I used only one plausible value as the dependent variable (which suited me just fine, since the final dataset I will be using for my analysis will not have plausible values as the variable of interest) and just worked on understanding the general way of producing imputations and using the mi command to combine the results.

I'm really sorry I can't be of any help here.
Also, I didn't work with the PISA data, so I didn't face the problem of a non-existent original variable.

Minka
Laura Hernandez

Join Date: Dec 2015

Posts: 1
#22

03 Dec 2015, 01:56

Hi everyone,

I'm new here, so let me first apologise if I don't follow all the basic rules of the forum. I am a Stata 12.1 (64-bit) user. This thread has been very helpful to me in learning how to carry out a PISA multilevel analysis with plausible values in Stata by means of multiple imputation. However, I have a question about how to svyset the data before proceeding with the multiple imputation and multilevel regressions.

The svyset command that I have usually seen used for PISA is the following, which uses the students' weights "w_fstuwt" (basically the same as the one Vincent Puchades posted several months ago):

Code:

svyset [pweight=w_fstuwt], brr(w_fstr*) vce(brr) fay(.5) mse

This code takes into account the students' weights, but what should I do to add a structure that also includes the school level? Students are nested within schools, which in PISA also have their own weights. To svyset the data including the schools, I have found the following code:

Code:

svyset schoolid [pw=w_fstuwt], brr(w_fstr*) vce(brr) fay(.5) mse

However, in this last code I wonder why the school weights (w_fschwt) are not integrated.

Reading the svyset help for Stata 14.1 (http://www.stata.com/new-in-stata/mu...ls-survey-data) I have tried the following but it doesn't work for me with my Stata 12.1 version:

Code:

svyset schoolid, weight(w_fschwt) || _n, weight(w_fstuwt)

I get the following error message:

Code:

option weight() not allowed

Of course, I would also like to svyset my data including the specifics of PISA (brr(w_fstr*) vce(brr) fay(.5) mse). Reading the svyset help for Stata 12.1 and looking at the examples, I could not find anywhere in the syntax where to put the schools' weights in the svyset command.

I wonder if you could give me a hint on how to svyset the data so that the PISA weighting scheme is included correctly, including schools and their weights.

I would really appreciate any help you can provide. Thanks in advance,

Laura

Last edited by Laura Hernandez; 03 Dec 2015, 01:59.