Dear all,
I am rather desperate to obtain the marginal effect of father’s education on the probability of college graduation after running a logistic regression for each of a number of countries that participated in the PIAAC survey carried out by the OECD.
The model is quite simple: the dependent variable is binary, my main independent variable is father’s education (three categories) and I have two controls (age and gender). In principle, it should not be difficult. But I have to account for the complex sample design of PIAAC, which means using a number of weights provided in PIAAC data.
A Stata module (repest) was created specifically for this purpose: “repest estimates statistics using replicate weights (…) thus accounting for complex survey designs in the estimation of sample variances”. It is designed for databases like IELS, PIAAC, PISA and TALIS.
The basic syntax of repest is as follows:
Code:
repest svyname [if] [in] , estimate(cmd [,cmd_options]) [options]
Here is one of the examples that the authors provide in the repest help file:
Code:
repest PIAAC, estimate(stata: reg lnwage pvlit@ yrsqual) by(cnt)
Since I want to run the same model for a number of countries in PIAAC, I intend to create a loop that includes repest. But I also want to generate the marginal effect of father’s education after the logistic regression for each country, storing these marginal effects and then saving them in a different Stata file (dta).
At the end of the repest help file, the authors provide a user-defined program precisely for logit postestimation:
Code:
User-defined estimation command: 2. logit postestimation

cap program drop mylogitmargins
program define mylogitmargins, eclass
    syntax [if] [in] [pweight], logit(string) [margins(string) loptions(string) moptions(string)]
    tempname b m
    // compute logit regressions, store results in vectors
    logit `logit' [`weight' `exp'] `if' `in', `loptions'
    matrix `b'= e(b)
    // compute logit postestimation, store results in vectors
    if "`margins'" != "" | "`moptions'" != ""{
        margins `margins', post `moptions'
        matrix `m' = e(b)
        matrix colnames `m' = margins:
        matrix `b'= [`b', `m']
    }
    // post results
    ereturn post `b'
end

. repest PISA, estimate(stata: mylogitmargins, logit(repeat pv@math escs ib1.st04q01) margins(st04q01) moptions(atmeans))
Yet I do not know how to replicate this with my data and, in particular, how to make sure that the marginal effects of father’s education for each country are stored after each logit.
I have succeeded in making repest work with my logit model. Below I show, in order: a program that makes the name of the country appear in the output; a copy of the logit postestimation program offered by the authors of repest; and the loop in which I call repest to estimate the logit model for each country:
Code:
egen cntryid3_group=group(cntryid3), label

program define pe
    if `"`0'"' != "" {
        display as text `"`0'"'
        `0'
        display("")
    }
end

cap program drop mylogitmargins
program define mylogitmargins, eclass
    syntax [if] [in] [pweight], logit(string) [margins(string) loptions(string) moptions(string)]
    tempname b m
    // compute logit regressions, store results in vectors
    logit `logit' [`weight' `exp'] `if' `in', `loptions'
    matrix `b'= e(b)
    // compute logit postestimation, store results in vectors
    if "`margins'" != "" | "`moptions'" != ""{
        margins `margins', post `moptions'
        matrix `m' = e(b)
        matrix colnames `m' = margins:
        matrix `b'= [`b', `m']
    }
    // post results
    ereturn post `b'
end

foreach i of numlist 1/24 {
    display "`: label (cntryid3_group) `i''"
    pe capture noisily repest PIAAC, estimate(stata: mylogitmargins, logit(univ i.edufath female age if cntryid3_group==`i' & egresados==1) margins(r.edufath))
}
Next, I show the output for the second country in the list. The last two lines of the output are precisely the contrasts of marginal effects across the three categories of father's education (second versus first, third versus first). This is what I want; yet I do not know how to store them for each country, or how to retrieve them afterwards.
Code:
capture noisily repest PIAAC, estimate(stata: mylogitmargins, logit(univ i.edufath
> female age if cntryid3_group==2 & egresados==1) margins(r.edufath))
(note: file C:\Users\LOrti\AppData\Local\Temp\ST_00000005.tmp not found)
file C:\Users\LOrti\AppData\Local\Temp\ST_00000005.tmp saved

_pooled. : _pooled
----------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
univ_1b_edufat~r |          0  (omitted)
univ_2_edufather |   1.384978   .1785063     7.76   0.000     1.035112    1.734844
univ_3_edufather |   2.719534   .2327599    11.68   0.000     2.263333    3.175735
     univ_female |   .0569396   .1659537     0.34   0.732    -.2683237     .382203
        univ_age |  -.0004378   .0151277    -0.03   0.977    -.0300876    .0292121
      univ__cons |  -2.912376   .5037946    -5.78   0.000    -3.899795   -1.924956
margins_r2vs1_~r |   .1281268   .0182265     7.03   0.000     .0924035    .1638501
margins_r3vs1_~r |   .4029726   .0477158     8.45   0.000     .3094514    .4964938
----------------------------------------------------------------------------------
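To make concrete what I mean by storing, here is a minimal sketch of the bookkeeping I have in mind. It uses plain logit and margins rather than repest, so it ignores the PIAAC replicate weights and its estimates are not the ones I need; it only illustrates posting the two contrasts per country to a new dataset with postfile. The file name me_edufather.dta and the variable names me_2vs1 and me_3vs1 are made up for this illustration, and the sketch assumes that every country has all three categories of father's education.

```stata
* Sketch only: unweighted logit + margins, NOT the repest estimates.
* Hypothetical names: me_edufather.dta, me_2vs1, me_3vs1.
* The egresados==1 restriction from my loop is left out here.
tempname handle M
postfile `handle' int country str40 cname double(me_2vs1 me_3vs1) ///
    using "me_edufather.dta", replace

levelsof cntryid3_group, local(countries)
foreach c of local countries {
    // fit the model for one country; skip the country if logit fails
    capture noisily logit univ i.edufather female age if cntryid3_group == `c'
    if _rc continue

    // contrasts of average marginal effects (2 vs 1, 3 vs 1), posted to e(b)
    margins r.edufather, post
    matrix `M' = e(b)

    // one row per country: id, label, and the two contrasts
    post `handle' (`c') ("`: label (cntryid3_group) `c''") ///
        (`M'[1,1]) (`M'[1,2])
}
postclose `handle'

// inspect the stored contrasts without losing the data in memory
preserve
use "me_edufather.dta", clear
list, noobs
restore
```

What I cannot work out is how to do the equivalent after each repest call, so that the stored contrasts and their standard errors reflect the replicate weights.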
Could you help me with this?
Thanks for your attention
Luis Ortiz
PS: In case it is of any use, I include a sample of my data, extracted from my dataset using dataex:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input double edufather float(age female univ cntryid3_group)
1 99 1 1 1
2 99 0 1 1
3 99 1 1 1
1 99 1 0 1
1 99 1 0 1
2 99 0 0 1
3 99 0 1 1
3 99 0 0 1
3 99 1 0 1
1 99 1 0 1
3 99 0 0 1
1 99 1 0 1
2 99 1 0 1
3 99 1 1 1
3 99 1 1 1
2 99 0 0 1
1 99 0 0 1
3 99 1 1 1
2 99 1 0 1
2 99 0 1 1
3 99 0 0 1
1 99 0 0 1
3 99 1 1 1
3 99 1 0 1
1 99 1 0 1
2 99 0 0 1
2 99 1 1 1
1 99 0 0 1
. 99 0 0 1
1 99 0 0 1
1 99 1 1 1
1 99 1 0 1
1 99 0 0 1
2 99 1 0 1
1 99 0 0 1
2 99 0 1 1
3 99 1 0 1
3 99 0 0 1
2 99 0 0 1
1 99 1 0 1
. 99 0 0 1
2 99 0 0 1
3 99 0 0 1
1 99 1 0 1
2 99 0 1 1
3 99 0 1 1
3 99 0 1 1
3 99 1 1 1
3 99 0 0 1
2 99 0 0 1
3 99 1 0 1
1 99 0 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
. 99 0 0 1
. 99 0 0 1
2 99 0 0 1
1 99 1 0 1
1 99 1 1 1
1 99 0 0 1
1 99 1 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
1 99 1 0 1
3 99 0 1 1
1 99 0 0 1
2 99 1 0 1
1 99 0 1 1
2 99 0 1 1
1 99 0 0 1
1 99 0 0 1
2 99 1 0 1
1 99 1 0 1
2 99 1 0 1
1 99 0 0 1
2 99 0 1 1
3 99 0 0 1
3 99 1 0 1
3 99 1 0 1
1 99 0 0 1
3 99 1 1 1
3 99 0 1 1
3 99 0 0 1
2 99 0 0 1
3 99 0 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
3 99 1 0 1
1 99 0 0 1
2 99 1 1 1
3 99 1 0 1
2 99 0 0 1
. 99 0 0 1
1 99 0 0 1
1 99 0 0 1
1 99 0 0 1
2 99 0 0 1
end
label values edufather edu_fat
label def edu_fat 1 "ISCED 1/2/3sh", modify
label def edu_fat 2 "ISCED 3/4", modify
label def edu_fat 3 "ISCED 5/6", modify
label values female gndr
label def gndr 0 "Male", modify
label def gndr 1 "Female", modify
label values univ univ_lab
label def univ_lab 0 "No uni", modify
label def univ_lab 1 "Univ", modify
label values cntryid3_group cntryid3_group
label def cntryid3_group 1 "124. Canada", modify