TIMSS Plausible values

Ahmed Barry

Join Date: Mar 2018

Posts: 3
#1


11 Mar 2018, 08:34

Dear members of the forum,
Would any of you be kind enough to help me find the command to analyze TIMSS data using all five plausible values? I want to model, in Stata, student achievement in math (the dependent variable) using all five plausible values, by sex, teacher education level, or any other independent variable.
All your efforts and time are appreciated.
Barry
Emil Smith

Join Date: Mar 2015

Posts: 8
#2

11 Mar 2018, 12:27

I don't believe that there is a specific command for that. But you are correct that you need to perform your calculations using all five plausible values to estimate the standard errors correctly. The workaround I have used in the past is to estimate, e.g., a regression model five times (once with each of the plausible values as the dependent variable) and then paste the results into Excel to calculate the mean of the estimates. See "International Large-Scale Assessment Data: Issues in Secondary Analysis and Reporting" by Rutkowski et al. (2010) for more information.
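
The pooling step of this workaround can also be scripted instead of done in a spreadsheet. Below is a minimal sketch of the standard multiple-imputation combining rules (Rubin's rules), written in Python purely to illustrate the arithmetic. The function name pool_rubin and all numbers are made up for illustration and are not TIMSS results; it assumes each of the five regressions yields a coefficient and its standard error, and it does not by itself handle the survey design.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine M per-plausible-value results with Rubin's rules.

    estimates: the coefficient from each of the M regressions
    variances: the squared standard error from each regression
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two entries")
    pooled = mean(estimates)        # average of the M coefficients
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance across PVs (divisor M - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical coefficients and standard errors for pv1..pv5 (illustration only)
coefs = [10.0, 12.0, 11.0, 9.0, 13.0]
ses = [2.0, 2.0, 2.0, 2.0, 2.0]
estimate, se = pool_rubin(coefs, [s ** 2 for s in ses])
print(estimate, round(se, 3))
```

With these made-up inputs the pooled standard error comes out larger than any single regression's standard error, because the spread between plausible values is added on top of the average sampling variance. That is why averaging only the estimates, or only the variances, is not enough.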
Maarten Buis

Join Date: Mar 2014

Posts: 3464
#3

11 Mar 2018, 12:32

Better yet, use Stata's mi suite of commands, i.e. treat the plausible values as imputations.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#4

11 Mar 2018, 12:35

Ahmed, in general, it's best to assume that people on a general forum like Statalist won't be familiar with any specific dataset. I have no idea what TIMSS stands for. That said, it sounds like you will need to set the dataset up for multiple imputation. In the command line, type:

Code:

help mi

If TIMSS are weighted survey data, then you're probably familiar with the -svyset- set of commands. The -mi- set of commands are similar in that you need to declare the data as multiply imputed, and then prefix any estimation commands with -mi estimate:- (this stacks with the -svy:- prefix, I believe).

The general advice I've heard is that 5 multiply imputed datasets are too few. Paul Allison offers a general guide here. If you're just starting out in MI, I would start by just setting the data up as is (i.e. with the 5 imputations you were already provided). As you get more familiar with the concept of MI, you could consider if you need to impute the data yourself.

(Crossed with Maarten's post)

Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.
Ahmed Barry

Join Date: Mar 2018

Posts: 3
#5

12 Mar 2018, 01:57

I really thank you all for your contributions. It is my hope that someone comes up with an easy way to help analyze TIMSS, PIRLS, PISA etc. data with plausible values.
Maarten Buis

Join Date: Mar 2014

Posts: 3464
#6

12 Mar 2018, 02:38

Originally posted by Ahmed Barry

It is my hope that someone comes up with an easy way to help analyze TIMSS, PIRLS, PISA etc. data with plausible values.

I don't understand that remark. Are you saying that that easy way was not provided?

Ahmed Barry

Join Date: Mar 2018

Posts: 3
#7

12 Mar 2018, 08:01

Hi Maarten,
As you may know, the Trends in International Mathematics and Science Study (TIMSS) reports student achievement with five plausible values (pv1, pv2, pv3, pv4, pv5). Neither analyzing a single plausible value nor averaging them is recommended; you have to use all of them in the analysis. Some people who have difficulty with this use a workaround, which consists of running the analysis with each plausible value separately and then averaging the variances. Would you explain more about the mi approach you mentioned? Let's say I have five plausible values of student achievement in math and I would like to run an ANOVA to compare mean achievement differences by gender, subject value, and expectation of academic attainment.
Thank you again for your effort and time.
Maarten Buis

Join Date: Mar 2014

Posts: 3464
#8

12 Mar 2018, 08:27

The basic structure would be to tell Stata which variables it should treat as imputed/plausible values using mi import, and then estimate your model with mi estimate.

The devil in such cases is usually in the details, so you will probably need to do some extensive reading in the help files and the manual. At the top of each help file you will see a link to the manual entry (view complete PDF manual entry). The latter is usually much more detailed and contains discussions of worked examples, etc. So if you are not familiar with mi, you'll probably need to read those too, along with the general description of mi in both help mi and the [MI] intro substantive manual entry.

Philip Matthews

Join Date: Apr 2014

Posts: 23
#9

12 Mar 2018, 10:34

Ahmed, you may find the user-written package pv to be of help. Type net describe pv into Stata's Command window. You should see a brief description of the package. If you want to download the package, type net install pv. The package help file contains details, for example, of how to perform regression with plausible values using TIMSS or PISA data. (Type help pv if/when you have downloaded the package.)
Best wishes, Philip
Weiwen Ng

Join Date: Jun 2015

Posts: 1241
#10

12 Mar 2018, 13:40

Originally posted by Ahmed Barry

Hi Maarten,
As you may know, the Trends in International Mathematics and Science Study (TIMSS) reports student achievement with five plausible values (pv1, pv2, pv3, pv4, pv5). Neither analyzing a single plausible value nor averaging them is recommended; you have to use all of them in the analysis. Some people who have difficulty with this use a workaround, which consists of running the analysis with each plausible value separately and then averaging the variances. Would you explain more about the mi approach you mentioned? Let's say I have five plausible values of student achievement in math and I would like to run an ANOVA to compare mean achievement differences by gender, subject value, and expectation of academic attainment.
Thank you again for your effort and time.

I glanced at the description for the pv package that Philip mentioned. It looks like it is specifically written to work with TIMSS and a few other datasets, so this should work well for you. But briefly, the mi series of commands can impute missing values using a variety of imputation methods, including chained equations, run estimation commands on the imputed datasets, then combine the results using Rubin's rules for both the betas and their standard errors. If the dataset already contains 5 imputations, you can skip the first step and tell mi to just import those 5 imputations.

I know the documentation for mi is rather long, but I think this would have been covered in the introduction if you'd gone and read it as Maarten and I suggested. It ultimately uses the same theory that pv does. Maybe the only real difference is that, as pv's documentation says,

The IEA recommends a different formula for calculating the standard error of the statistic for TIMSS and PIRLS data (IEA 2005: pp 2-52) which is implemented by specifying the timss or pirls option.

I don't know what this refers to, as I'm not familiar with that dataset.

wbuchanan

Join Date: Mar 2014

Posts: 1362
#11

17 Mar 2018, 05:45

It could be the case that TIMSS and PIRLS use different sampling methods. The advice provided by Maarten previously is part of the answer. The remaining part of the answer is that you also need to svyset the data using the correct weights and variance estimator. With NAEP results, they recommend using BRR standard errors with one set of weights for some applications while recommending jackknifed standard errors for other applications. There isn’t truly a “simple” solution beyond saying that, as a researcher, you need to make sure you account for the sampling design and are familiar enough with plausible values to know that they are derived from Rubin’s original work on multiple imputation.
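
To make the role of replicate-based standard errors concrete, here is a rough Python sketch of the generic calculation that BRR and jackknife estimators share: re-estimate the statistic once per set of replicate weights, then sum the squared deviations from the full-sample estimate, scaled by a method-specific factor. The function name, the scale argument, and the numbers are hypothetical; the correct factor and replicate weights must come from each study's technical documentation. In Stata this is handled by svyset and the svy prefix rather than by hand.

```python
def replicate_variance(full_estimate, replicate_estimates, scale):
    """Sampling variance from a set of replicate estimates.

    scale is method-specific, e.g. 1 / R for plain BRR with R replicates
    or (R - 1) / R for a delete-one-group jackknife; check the study's
    documentation for the factor that actually applies.
    """
    return scale * sum((r - full_estimate) ** 2 for r in replicate_estimates)


# Hypothetical full-sample mean and four replicate means (illustration only)
full = 500.0
reps = [498.0, 502.0, 501.0, 499.0]
brr_var = replicate_variance(full, reps, scale=1 / len(reps))
jk1_var = replicate_variance(full, reps, scale=(len(reps) - 1) / len(reps))
print(brr_var, jk1_var)
```

The same replicate estimates give different variances under different factors, which is why the estimator has to match the survey's design. With plausible values, a sampling variance of this kind is what enters the combining rules as the within-imputation component.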
Luis Muelle

Join Date: Sep 2014

Posts: 10
#12

11 Apr 2018, 02:31

I think that the IEA IDB Analyzer is suited to your needs. You can use it to combine and analyze data from all of the IEA’s large-scale assessments, as well as data from most other major large-scale assessments, including TIMSS of course. You can download the software and its user's guide from the website of the IEA (the International Association for the Evaluation of Educational Achievement).