
  • Storing marginal effects after logistic regression in a loop

    Dear all,

    I am rather desperate to get the marginal effect of father’s education on the probability of college graduation after running a logistic regression for a number of countries that participated in the PIAAC survey carried out by OECD.

    The model is quite simple: the dependent variable is binary, my main independent variable is father’s education (three categories) and I have two controls (age and gender). In principle, it should not be difficult. But I have to account for the complex sample design of PIAAC, which means using a number of weights provided in PIAAC data.

    A Stata module (repest) was created specifically for this purpose: “repest estimates statistics using replicate weights (…) thus accounting for complex survey designs in the estimation of sample variances”. It is designed especially for databases like IELS, PIAAC, PISA, TALIS…

    ‘Repest’ basically works as follows:

    Code:
    repest svyname [if] [in] , estimate(cmd [,cmd_options]) [options]

    Next, there is one of the examples provided by the authors in the corresponding help file of repest:

    Code:
    repest PIAAC, estimate(stata: reg lnwage pvlit@ yrsqual) by(cnt

    Since I want to run the same model for a number of countries in PIAAC, I intend to create a loop that includes repest. But I also want to generate the marginal effect of father’s education after the logistic regression for each country, storing these marginal effects and then saving them in a different Stata file (dta).

    At the end of the repest help file, the authors provide a program precisely for logit postestimation:

    Code:
        User-defined estimation command: 2. logit postestimation
    
            cap program drop mylogitmargins
            program define mylogitmargins, eclass
            syntax [if] [in] [pweight], logit(string) [margins(string) loptions(string) moptions(string)]
            tempname b m
            // compute logit regressions, store results in vectors
                    logit `logit' [`weight' `exp'] `if' `in', `loptions'
                    matrix `b'= e(b)
            // compute logit postestimation, store results in vectors
                    if "`margins'" != "" | "`moptions'" != ""{
                            margins `margins', post `moptions'
                            matrix `m' = e(b)
                            matrix colnames `m' =  margins:
                            matrix `b'= [`b', `m']
                            }
            // post results
                    ereturn post `b' 
            end
        . repest PISA, estimate(stata: mylogitmargins, logit(repeat pv@math escs ib1.st04q01) margins(st04q01) moptions(atmeans))

    Yet, I do not know how to replicate this with my data and, in particular, how to make sure that the marginal effects of father’s education for each country are stored after each logit.

    I have succeeded in making ‘repest’ work with my logit model. Below I show three things: a small program that makes the name of the country appear in the output, a copy of the logit postestimation program offered by the authors of repest and, finally, the loop in which I call repest to estimate the logit model for each country:

    Code:
    egen cntryid3_group=group(cntryid3), label
    
    program define pe
            if `"`0'"' != "" {
            display as text `"`0'"'
            `0'
            display("")
        }
    end
    
            cap program drop mylogitmargins
            program define mylogitmargins, eclass
            syntax [if] [in] [pweight], logit(string) [margins(string) loptions(string) moptions(string)]
            tempname b m
            // compute logit regressions, store results in vectors
                    logit `logit' [`weight' `exp'] `if' `in', `loptions'
                    matrix `b'= e(b)
            // compute logit postestimation, store results in vectors
                    if "`margins'" != "" | "`moptions'" != ""{
                            margins `margins', post `moptions'
                            matrix `m' = e(b)
                            matrix colnames `m' =  margins:
                            matrix `b'= [`b', `m']
                            }
            // post results
                    ereturn post `b' 
            end
    
    
    foreach i of numlist 1/24 {
           display "`: label (cntryid3_group) `i''"
           pe capture noisily repest PIAAC, estimate(stata: mylogitmargins, logit(univ i.edufath female age if cntryid3_group==`i' & egresados==1) margins(r.edufath))
           }
    But I have not succeeded in storing the marginal effects of father’s education after the logistic regression for each country.

    Next, I show the output for the second country in the list. The last two lines of the output are precisely the contrasts of marginal effects for the three categories of father's education (second versus first, third versus first). This is what I want; yet I do not know how to store them for each country, or how to retrieve them afterwards.

    Code:
    capture noisily repest PIAAC, estimate(stata: mylogitmargins, logit(univ i.edufath 
    > female age if cntryid3_group==2 & egresados==1) margins(r.edufath))
    (note: file C:\Users\LOrti\AppData\Local\Temp\ST_00000005.tmp not found)
    file C:\Users\LOrti\AppData\Local\Temp\ST_00000005.tmp saved
    
    _pooled.
     : _pooled
    ----------------------------------------------------------------------------------
                     |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -----------------+----------------------------------------------------------------
    univ_1b_edufat~r |          0  (omitted)
    univ_2_edufather |   1.384978   .1785063     7.76   0.000     1.035112    1.734844
    univ_3_edufather |   2.719534   .2327599    11.68   0.000     2.263333    3.175735
         univ_female |   .0569396   .1659537     0.34   0.732    -.2683237     .382203
            univ_age |  -.0004378   .0151277    -0.03   0.977    -.0300876    .0292121
          univ__cons |  -2.912376   .5037946    -5.78   0.000    -3.899795   -1.924956
    margins_r2vs1_~r |   .1281268   .0182265     7.03   0.000     .0924035    .1638501
    margins_r3vs1_~r |   .4029726   .0477158     8.45   0.000     .3094514    .4964938
    ----------------------------------------------------------------------------------

    Could you help me with this?

    Thanks for your attention

    Luis Ortiz

    P.S.: In case it is of any use, I include a sample of my data, extracted from my dataset using dataex:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double edufather float(age female univ cntryid3_group)
    1 99 1 1 1
    2 99 0 1 1
    3 99 1 1 1
    1 99 1 0 1
    1 99 1 0 1
    2 99 0 0 1
    3 99 0 1 1
    3 99 0 0 1
    3 99 1 0 1
    1 99 1 0 1
    3 99 0 0 1
    1 99 1 0 1
    2 99 1 0 1
    3 99 1 1 1
    3 99 1 1 1
    2 99 0 0 1
    1 99 0 0 1
    3 99 1 1 1
    2 99 1 0 1
    2 99 0 1 1
    3 99 0 0 1
    1 99 0 0 1
    3 99 1 1 1
    3 99 1 0 1
    1 99 1 0 1
    2 99 0 0 1
    2 99 1 1 1
    1 99 0 0 1
    . 99 0 0 1
    1 99 0 0 1
    1 99 1 1 1
    1 99 1 0 1
    1 99 0 0 1
    2 99 1 0 1
    1 99 0 0 1
    2 99 0 1 1
    3 99 1 0 1
    3 99 0 0 1
    2 99 0 0 1
    1 99 1 0 1
    . 99 0 0 1
    2 99 0 0 1
    3 99 0 0 1
    1 99 1 0 1
    2 99 0 1 1
    3 99 0 1 1
    3 99 0 1 1
    3 99 1 1 1
    3 99 0 0 1
    2 99 0 0 1
    3 99 1 0 1
    1 99 0 0 1
    2 99 0 0 1
    2 99 1 0 1
    1 99 0 0 1
    . 99 0 0 1
    . 99 0 0 1
    2 99 0 0 1
    1 99 1 0 1
    1 99 1 1 1
    1 99 0 0 1
    1 99 1 0 1
    2 99 0 0 1
    2 99 1 0 1
    1 99 0 0 1
    1 99 1 0 1
    3 99 0 1 1
    1 99 0 0 1
    2 99 1 0 1
    1 99 0 1 1
    2 99 0 1 1
    1 99 0 0 1
    1 99 0 0 1
    2 99 1 0 1
    1 99 1 0 1
    2 99 1 0 1
    1 99 0 0 1
    2 99 0 1 1
    3 99 0 0 1
    3 99 1 0 1
    3 99 1 0 1
    1 99 0 0 1
    3 99 1 1 1
    3 99 0 1 1
    3 99 0 0 1
    2 99 0 0 1
    3 99 0 0 1
    2 99 0 0 1
    2 99 1 0 1
    1 99 0 0 1
    3 99 1 0 1
    1 99 0 0 1
    2 99 1 1 1
    3 99 1 0 1
    2 99 0 0 1
    . 99 0 0 1
    1 99 0 0 1
    1 99 0 0 1
    1 99 0 0 1
    2 99 0 0 1
    end
    label values edufather edu_fat
    label def edu_fat 1 "ISCED 1/2/3sh", modify
    label def edu_fat 2 "ISCED 3/4", modify
    label def edu_fat 3 "ISCED 5/6", modify
    label values female gndr
    label def gndr 0 "Male", modify
    label def gndr 1 "Female", modify
    label values univ univ_lab
    label def univ_lab 0 "No uni", modify
    label def univ_lab 1 "Univ", modify
    label values cntryid3_group cntryid3_group
    label def cntryid3_group 1 "124. Canada", modify

  • #2
    There are a few too many steps in this problem for me to actually try to replicate it, and I'm not entirely sure where the problem arises.

    There are three steps to do what you want:
    1) Get the actual estimates (i.e. run the estimation)
    2) Make those estimates available to Stata (e.g. as a variable, matrix, local, ...)
    3) Store the estimates in an outside file

    Which of the steps are you stuck on?
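
    To make the three steps concrete, here is a minimal sketch for a single model. It is not your setup: it uses plain logit on the auto dataset shipped with Stata instead of repest on PIAAC, so replicate weights are ignored, and pricecat and the file name margins_demo.dta are made up for illustration.

```stata
* Sketch only: plain -logit- on a shipped dataset, not -repest- on PIAAC
sysuse auto, clear
* pricecat is a made-up three-category regressor standing in for edufath
generate byte pricecat = 1 + (price > 4500) + (price > 6500)

* Step 1: get the estimates
logit foreign i.pricecat mpg
margins r.pricecat, post        // contrasts vs. category 1 now sit in e(b)

* Step 2: make them available to Stata
matrix m = e(b)
matrix list m                   // shows the column names of the contrasts

* Step 3: store them in an outside file (hypothetical file name)
tempname h
postfile `h' double me2vs1 double me3vs1 using "margins_demo.dta", replace
post `h' (m[1,1]) (m[1,2])      // 2 vs 1 first, then 3 vs 1
postclose `h'

use "margins_demo.dta", clear
list
```

    With repest, step 2 would presumably read the matrix that repest itself leaves behind rather than the one from margins; I have not checked that here. The repest help file also describes an outfile() option that may cover step 3 directly, so it is worth a look.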

    • #3
      Thanks for your reply, Jesse

      I believe I'm stuck on steps 2 and 3.

      If you look at the last table in my initial post, you'll notice that the marginal effects are there, in the last two lines at the bottom of the table: margins_r2vs1_~r & margins_r3vs1_~r. I am not sure why they appear there, listed immediately after the covariates.

      Yet, I do not know how to store the marginal effects in a file that can be later retrieved as a Stata file.

      Thanks for your attention again

      Best

      Luis

      • #4
        It is often helpful to get something running properly before you start wrapping it in a bunch of programs and loops.

        If, after the margins statement, you issue margins, coeflegend (margins, coefl for short), you will see how to refer to those margins. You can then write them into a matrix wherever you want. Note that if you're doing this in a loop, you will need either a counter, so that each iteration uses a different matrix name, or a setup that appends a row to a single matrix on each iteration.
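
        A rough sketch of the "add a row each time" variant follows. Everything in it is an assumption for illustration: the data are simulated, the three countries are hypothetical, the matrix and file names are invented, and it calls plain logit rather than repest, so it only collects point estimates.

```stata
* Simulated stand-in data: 3 hypothetical countries, 1000 observations each
clear
set seed 12345
set obs 3000
generate byte country = ceil(_n/1000)
generate byte edufath = ceil(3*runiform())       // categories 1, 2, 3
generate byte female  = runiform() < .5
generate age          = 25 + floor(40*runiform())
generate byte univ    = runiform() < invlogit(-2 + 0.7*(edufath-1) + 0.1*female)

* Collect one row per country: country id, contrast 2 vs 1, contrast 3 vs 1
capture matrix drop allme
forvalues i = 1/3 {
    quietly logit univ i.edufath female age if country == `i'
    quietly margins r.edufath, post
    matrix m = e(b)
    * nullmat() lets the first iteration append to a not-yet-existing matrix
    matrix allme = nullmat(allme) \ (`i', m[1,1], m[1,2])
}
matrix colnames allme = country me2vs1 me3vs1
matrix list allme

* Turn the matrix into a dataset and save it (hypothetical file name)
clear
svmat double allme, names(col)
save "margins_by_country.dta", replace
```

        Because every iteration appends to the same matrix, no per-iteration matrix names are needed; the loop index stored in the first column plays the role of the counter.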
