Storing marginal effects after logistic regression in a loop

Luis Ortiz

Join Date: Dec 2014
Posts: 97

05 Mar 2020, 15:01

Dear all,

I am rather desperate to get the marginal effect of father’s education on the probability of college graduation after running a logistic regression for a number of countries that participated in the PIAAC survey carried out by OECD.

The model is quite simple: the dependent variable is binary, my main independent variable is father’s education (three categories) and I have two controls (age and gender). In principle, it should not be difficult. But I have to account for the complex sample design of PIAAC, which means using a number of weights provided in PIAAC data.

A Stata module (repest) was created specifically for this purpose: “repest estimates statistics using replicate weights (…) thus accounting for complex survey designs in the estimation of sample variances”. It is designed especially for databases like IELS, PIAAC, PISA, TALIS…

‘Repest’ basically works as follows:

Code:


repest svyname [if] [in] , estimate(cmd [,cmd_options]) [options]

Here is one of the examples provided by the authors in the repest help file:

Code:


repest PIAAC, estimate(stata: reg lnwage pvlit@ yrsqual) by(cnt)

Since I want to run the same model for a number of countries in PIAAC, I intend to create a loop that includes repest. But I also want to generate the marginal effect of father’s education after the logistic regression for each country, storing these marginal effects and then saving them in a different Stata file (dta).

At the end of the repest help file, the authors provide a program written precisely for logit postestimation:

Code:

    User-defined estimation command: 2. logit postestimation

        cap program drop mylogitmargins
        program define mylogitmargins, eclass
        syntax [if] [in] [pweight], logit(string) [margins(string) loptions(string) moptions(string)]
        tempname b m
        // compute logit regressions, store results in vectors
                logit `logit' [`weight' `exp'] `if' `in', `loptions'
                matrix `b'= e(b)
        // compute logit postestimation, store results in vectors
                if "`margins'" != "" | "`moptions'" != ""{
                        margins `margins', post `moptions'
                        matrix `m' = e(b)
                        matrix colnames `m' =  margins:
                        matrix `b'= [`b', `m']
                        }
        // post results
                ereturn post `b' 
        end
    . repest PISA, estimate(stata: mylogitmargins, logit(repeat pv@math escs ib1.st04q01) margins(st04q01) moptions(atmeans))

Yet, I do not know how to replicate this with my data and, in particular, how to make sure that the marginal effects of father’s education for each country are stored after each logit.

I have succeeded in making ‘repest’ work with my logit model. Below I show three things: a program that makes the name of the country appear in the output, a replica of the logit postestimation program offered by the authors of repest and, finally, the loop in which I call repest to estimate the logit probabilities for each country:

Code:

egen cntryid3_group=group(cntryid3), label

capture program drop pe
program define pe
    // echo the command line, run it, then print a blank line
    if `"`0'"' != "" {
        display as text `"`0'"'
        `0'
        display ""
    }
end

        cap program drop mylogitmargins
        program define mylogitmargins, eclass
        syntax [if] [in] [pweight], logit(string) [margins(string) loptions(string) moptions(string)]
        tempname b m
        // compute logit regressions, store results in vectors
                logit `logit' [`weight' `exp'] `if' `in', `loptions'
                matrix `b'= e(b)
        // compute logit postestimation, store results in vectors
                if "`margins'" != "" | "`moptions'" != ""{
                        margins `margins', post `moptions'
                        matrix `m' = e(b)
                        matrix colnames `m' =  margins:
                        matrix `b'= [`b', `m']
                        }
        // post results
                ereturn post `b' 
        end


foreach i of numlist 1/24 {
       display "`: label (cntryid3_group) `i''"
       pe capture noisily repest PIAAC, estimate(stata: mylogitmargins, logit(univ i.edufath female age if cntryid3_group==`i' & egresados==1) margins(r.edufath))
       }

But I have not succeeded in storing the marginal effects of father’s education after the logistic regression for each country.

Next, I show the results (output) for the second country of the list. The last two lines in the output are precisely the contrast of marginal effects for the three categories of father's education (second versus first, third versus first). It's what I want; yet, I do not know how to store them for each country, and how to retrieve them afterwards.

Code:

capture noisily repest PIAAC, estimate(stata: mylogitmargins, logit(univ i.edufath 
> female age if cntryid3_group==2 & egresados==1) margins(r.edufath))
(note: file C:\Users\LOrti\AppData\Local\Temp\ST_00000005.tmp not found)
file C:\Users\LOrti\AppData\Local\Temp\ST_00000005.tmp saved

_pooled.
 : _pooled
----------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
univ_1b_edufat~r |          0  (omitted)
univ_2_edufather |   1.384978   .1785063     7.76   0.000     1.035112    1.734844
univ_3_edufather |   2.719534   .2327599    11.68   0.000     2.263333    3.175735
     univ_female |   .0569396   .1659537     0.34   0.732    -.2683237     .382203
        univ_age |  -.0004378   .0151277    -0.03   0.977    -.0300876    .0292121
      univ__cons |  -2.912376   .5037946    -5.78   0.000    -3.899795   -1.924956
margins_r2vs1_~r |   .1281268   .0182265     7.03   0.000     .0924035    .1638501
margins_r3vs1_~r |   .4029726   .0477158     8.45   0.000     .3094514    .4964938
----------------------------------------------------------------------------------

Could you help me with this?

Thanks for your attention

Luis Ortiz

PS: In case it is of any use, I include a sample of my data, extracted from my dataset using dataex:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input double edufather float(age female univ cntryid3_group)
1 99 1 1 1
2 99 0 1 1
3 99 1 1 1
1 99 1 0 1
1 99 1 0 1
2 99 0 0 1
3 99 0 1 1
3 99 0 0 1
3 99 1 0 1
1 99 1 0 1
3 99 0 0 1
1 99 1 0 1
2 99 1 0 1
3 99 1 1 1
3 99 1 1 1
2 99 0 0 1
1 99 0 0 1
3 99 1 1 1
2 99 1 0 1
2 99 0 1 1
3 99 0 0 1
1 99 0 0 1
3 99 1 1 1
3 99 1 0 1
1 99 1 0 1
2 99 0 0 1
2 99 1 1 1
1 99 0 0 1
. 99 0 0 1
1 99 0 0 1
1 99 1 1 1
1 99 1 0 1
1 99 0 0 1
2 99 1 0 1
1 99 0 0 1
2 99 0 1 1
3 99 1 0 1
3 99 0 0 1
2 99 0 0 1
1 99 1 0 1
. 99 0 0 1
2 99 0 0 1
3 99 0 0 1
1 99 1 0 1
2 99 0 1 1
3 99 0 1 1
3 99 0 1 1
3 99 1 1 1
3 99 0 0 1
2 99 0 0 1
3 99 1 0 1
1 99 0 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
. 99 0 0 1
. 99 0 0 1
2 99 0 0 1
1 99 1 0 1
1 99 1 1 1
1 99 0 0 1
1 99 1 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
1 99 1 0 1
3 99 0 1 1
1 99 0 0 1
2 99 1 0 1
1 99 0 1 1
2 99 0 1 1
1 99 0 0 1
1 99 0 0 1
2 99 1 0 1
1 99 1 0 1
2 99 1 0 1
1 99 0 0 1
2 99 0 1 1
3 99 0 0 1
3 99 1 0 1
3 99 1 0 1
1 99 0 0 1
3 99 1 1 1
3 99 0 1 1
3 99 0 0 1
2 99 0 0 1
3 99 0 0 1
2 99 0 0 1
2 99 1 0 1
1 99 0 0 1
3 99 1 0 1
1 99 0 0 1
2 99 1 1 1
3 99 1 0 1
2 99 0 0 1
. 99 0 0 1
1 99 0 0 1
1 99 0 0 1
1 99 0 0 1
2 99 0 0 1
end
label values edufather edu_fat
label def edu_fat 1 "ISCED 1/2/3sh", modify
label def edu_fat 2 "ISCED 3/4", modify
label def edu_fat 3 "ISCED 5/6", modify
label values female gndr
label def gndr 0 "Male", modify
label def gndr 1 "Female", modify
label values univ univ_lab
label def univ_lab 0 "No uni", modify
label def univ_lab 1 "Univ", modify
label values cntryid3_group cntryid3_group
label def cntryid3_group 1 "124. Canada", modify

Tags: foreach, loop, margins, repest, store

Jesse Wursten

Join Date: Jan 2016

Posts: 915
#2

05 Mar 2020, 16:23

There are a few too many steps in this problem for me to try to replicate it, and I'm not entirely sure where the problem arises.

There are three steps to do what you want:
1) Get the actual estimates (i.e. run the estimation)
2) Make those estimates available to Stata (e.g. as a variable, matrix, local, ...)
3) Store the estimates in an outside file

Which of the steps are you stuck on?
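
The three steps above can be sketched in plain Stata with postfile. This is only a minimal sketch under stated assumptions, not the repest workflow from the first post: it uses the nlsw88 dataset that ships with Stata, a plain logit instead of repest, south as a stand-in for the country identifier, and hypothetical names (exp3, me_2vs1, se_2vs1, me_3vs1, se_3vs1, margins_by_group.dta). The standard errors here come from the plain logit and do not account for replicate weights.

```stata
* Assumption: nlsw88 and a plain logit stand in for PIAAC and repest
sysuse nlsw88, clear
xtile exp3 = ttl_exp, nquantiles(3)      // hypothetical three-category predictor

tempname memhold
postfile `memhold' byte south double(me_2vs1 se_2vs1 me_3vs1 se_3vs1) ///
    using "margins_by_group.dta", replace

forvalues g = 0/1 {
    * Step 1: get the estimates for this group
    quietly logit collgrad i.exp3 age if south == `g'
    quietly margins r.exp3, post          // contrasts vs. the first category

    * Step 2: make them available as matrices
    matrix b = e(b)
    matrix V = e(V)

    * Step 3: write one row per group to the outside file
    post `memhold' (`g') (b[1,1]) (sqrt(V[1,1])) (b[1,2]) (sqrt(V[2,2]))
}
postclose `memhold'

use "margins_by_group.dta", clear
list
```

The resulting file has one row per group, so it can be merged or appended to other results later.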
Luis Ortiz

Join Date: Dec 2014

Posts: 97
#3

06 Mar 2020, 01:47

Thanks for your reply, Jesse

I believe I'm stuck on steps 2 and 3.

If you look at the last table in my initial post, you'll notice that the marginal effects are there, in the last two lines at the bottom of the table: margins_r2vs1_~r and margins_r3vs1_~r. I do not quite understand why they appear there, listed immediately after the covariates.
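
The placement appears to follow from the mylogitmargins program quoted in the first post: it gives the margins vector the equation name margins and appends it to the logit coefficient vector before posting. A minimal sketch of that concatenation, using values rounded from the posted output and hypothetical column names (edu2, edu3), is:

```stata
* Illustrative values rounded from the output in the first post
matrix b = (1.38, 2.72, -2.91)
matrix colnames b = edu2 edu3 _cons       // hypothetical names
matrix coleq b = univ                     // logit labels its equation with the depvar

matrix m = (0.13, 0.40)
matrix colnames m = r2vs1 r3vs1
matrix coleq m = margins                  // same role as "matrix colnames m = margins:"

matrix all = (b, m)                       // margins columns land after the covariates
matrix list all

local names : colfullnames all
display "`names'"
```

Because everything ends up in one row vector, the last two columns can be picked out by position or by their full column names.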

Yet, I do not know how to store the marginal effects in a file that can be later retrieved as a Stata file.

Thanks for your attention again

Best

Luis
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#4

06 Mar 2020, 13:59

It is often helpful to get something running properly before you start wrapping it in a bunch of programs and loops.

If, after the margins statement, you issue margins, coefl, you will see how to refer to those margins. You can then write them into a matrix wherever you want. Note that if you're doing this in a loop, you will need either a counter (or something similar) so that each iteration gives its matrix a different name, or a setup that adds a row to a single matrix each time.
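
The "add a row each time" option can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the nlsw88 dataset that ships with Stata and a plain logit rather than repest, south stands in for the country identifier, and the names exp3, R, me_2vs1 and me_3vs1 are hypothetical.

```stata
* Assumption: nlsw88 and a plain logit stand in for PIAAC and repest
sysuse nlsw88, clear
xtile exp3 = ttl_exp, nquantiles(3)      // hypothetical three-category predictor

capture matrix drop R
forvalues g = 0/1 {
    quietly logit collgrad i.exp3 age if south == `g'
    quietly margins r.exp3, post          // e(b) now holds the two contrasts
    matrix m = e(b)
    * nullmat() lets the first iteration start from an empty matrix
    matrix R = nullmat(R) \ (`g', m[1,1], m[1,2])
}
matrix colnames R = south me_2vs1 me_3vs1
matrix list R

* Turn the accumulated matrix into a dataset and save it
clear
svmat double R, names(col)
save "margins_matrix.dta", replace
```

Running margins, coeflegend right after the margins call inside the loop shows the names under which each contrast can be referenced.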