  • Estimating marginal effects of a constructed variable

    Hi Statalist community

    I want to run a probit regression and estimate marginal effects of two variables that do not directly enter into the regression


    Here is a simplified version of my problem:


    I want to calculate the marginal effects of continuous variables pay and length on a binary dependent variable taskaccept (which takes values 0 or 1)


    However, taskaccept is a function of ln(pay)/ln(length), such that


    \[
    taskaccept = a + b * (ln(pay)/ln(length))
    \]


    I created a variable


    \[
    lnpl = (ln(pay)/ln(length))
    \]


    And then ran the regression

    Code:
    probit taskaccept lnpl

    What can I do now to estimate separate marginal effects with respect to pay and length?


    I cannot run:

    Code:
    margins, dydx(pay length)
    because pay and length are not covariates.

    I can use the chain rule to calculate d(taskaccept)/d(pay) and d(taskaccept)/d(length) from _b[lnpl]. However, this restricts the marginal effects of pay and length to have the same standard errors, etc., whereas in the true relationship between taskaccept, pay, and length they would have separate marginal effects.
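
    For reference, a sketch of the chain-rule expressions for this probit, assuming the fitted model is \(\Pr(taskaccept = 1) = \Phi(a + b \cdot lnpl)\), where \(\Phi\) and \(\phi\) denote the standard normal CDF and density (notation assumed here, not part of the original post):

    \[
    \frac{\partial \Pr(taskaccept = 1)}{\partial pay} = \phi(a + b \cdot lnpl) \cdot \frac{b}{pay \cdot \ln(length)}
    \]

    \[
    \frac{\partial \Pr(taskaccept = 1)}{\partial length} = -\phi(a + b \cdot lnpl) \cdot \frac{b \cdot \ln(pay)}{length \cdot (\ln(length))^2}
    \]

    Both expressions are functions of the same two parameters, a and b, which is why the two effects are so tightly linked.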

    Is there a way for me to run a regression while "reminding" Stata that one of the independent variables is actually a composite of other variables? Or is there some other way for me to approach this problem?


    I'd really appreciate your help. I'm new to this forum, in case that wasn't already clear from the question.
    Last edited by Saika Belal; 27 Jul 2023, 21:11. Reason: Added tags

  • #2
    Can you use \(\ln(\frac{pay}{length})\) instead of \(\frac{\ln(pay)}{\ln(length)}\)? In that case the equation simplifies to \(\beta \ln(\frac{pay}{length}) = \beta (\ln(pay) -\ln(length)) = \beta \ln(pay) - \beta\ln(length)\). That means you just add both \(\ln(pay)\) and \(\ln(length)\) to the model and constrain the coefficient of \(\ln(length)\) to be minus the coefficient of \(\ln(pay)\), using the constraints() option. After that, you can just use margins to get marginal effects and their standard errors.
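
    A minimal sketch of that suggestion, assuming pay and length are strictly positive. The names lnpay and lnlength are hypothetical, not from the original posts, and margins here reports effects with respect to the logged variables rather than pay and length themselves.

    Code:
    * Sketch only: lnpay and lnlength are hypothetical variable names
    generate double lnpay    = ln(pay)
    generate double lnlength = ln(length)

    * Force the two coefficients to be equal in size and opposite in sign
    constraint define 1 lnpay = -lnlength
    probit taskaccept lnpay lnlength, constraints(1)

    * Marginal effects with respect to the logged variables
    margins, dydx(lnpay lnlength)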
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      You might try either gmm or nl. Here's an example using nl with the auto data:
      Code:
      cap preserve
      cap drop _all
      
      sysuse auto
      
      nl (price={alpha}+{beta}*log(weight)/log(length)), var(weight length)
      
      margins, dydx(*) gen(m)
      
      sum m1 m2
      
      cap restore
      Result:
      Code:
      . nl (price={alpha}+{beta}*log(weight)/log(length)), var(weight length)
      
      Iteration 0:  residual SS =  4.83e+08
      Iteration 1:  residual SS =  4.83e+08
      
      
            Source |      SS            df       MS
      -------------+----------------------------------    Number of obs =         74
             Model |  1.525e+08          1   152478134    R-squared     =     0.2401
          Residual |  4.826e+08         72  6702600.87    Adj R-squared =     0.2295
      -------------+----------------------------------    Root MSE      =   2588.938
             Total |  6.351e+08         73  8699525.97    Res. dev.     =   1371.108
      
      ------------------------------------------------------------------------------
             price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
            /alpha |  -99295.53   22113.07    -4.49   0.000    -143377.1   -55213.92
             /beta |   69129.61   14493.79     4.77   0.000     40236.77    98022.46
      ------------------------------------------------------------------------------
      Note: Parameter alpha is used as a constant term during estimation.
      
      .
      . margins, dydx(*) gen(m)
      
      Average marginal effects                                    Number of obs = 74
      Model VCE: GNR
      
      Expression: Fitted values, predict()
      dy/dx wrt:  weight length
      
      ------------------------------------------------------------------------------
                   |            Delta-method
                   |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
            weight |   4.723883   .9904146     4.77   0.000     2.782706     6.66506
            length |  -109.0759   22.86898    -4.77   0.000    -153.8983   -64.25354
      ------------------------------------------------------------------------------
      
      .
      . sum m1 m2
      
          Variable |        Obs        Mean    Std. dev.       Min        Max
      -------------+---------------------------------------------------------
                m1 |         74    4.723883    1.398918   2.620231   7.849442
                m2 |         74   -109.0759    14.93345  -148.9029  -84.71975
      
      .



      • #4

        Thank you both very much for your help. I began by looking into the nl option that John (@John Mullahy) suggested,
        and I have a few follow up questions.

        Ultimately, I want to estimate marginal effects of pay and length at different levels of factor variables akin to running:
        Code:
        probit taskaccept mainVAR##i.factor1##i.factor2
        margins i.factor1##i.factor2 , dydx(pay length)
        where mainVAR = (ln(pay)/ln(length))

        I expanded John's code to include interactions with one factor variable: "female" which takes values 0 and 1.
        Code:
        nl (taskaccept={a}+{b1}*log(pay)/log(length) + {b2}*female + {b3}*female*(log(pay)/log(length)) ) ///
        if length>1 , var(pay length female) coeflegend
        This part runs fine. (I use the condition "if length>1" to keep log(pay)/log(length) from evaluating to missing.)
        I would then like to run:
        Code:
        margins i.female, dydx(pay length) gen(m)
        But I get the error:
        Code:
        factor 'female' not found in list of covariates
        r(322);
        Do you have any suggestions for this?

        Additionally, if I want to add a set of covariate controls to my equation, will I need to extend the equation by adding a parameter for each control variable (see below)?
        Code:
        nl (taskaccept={a}+{b1}*log(pay)/log(length) + {b2}*female + {b3}*female*(log(pay)/log(length)) ///
        + {b4}*control1 + {b5}*control2 + {b6}*control3 + ... ) ///
        if length>1 , var(pay length female) coeflegend
        Last edited by Saika Belal; 02 Aug 2023, 14:07. Reason: added @John Mullahy



        • #5
          Hi Saika,
          You could use "f_able" (see how it works here: https://journals.sagepub.com/doi/pdf...6867X211000005).

          Here is a small example:

          Code:
          ssc install f_able
          ssc install frause
          frause oaxaca, clear
          ** Important, you need to use fgen to create the variables (so the "construction info" is left behind)
          fgen age_educ= age/educ
          
          ** add the original variables. If not meant to be included in your model add them with "o."
          reg lnwage age_educ o.age o.educ
          
          ** need to include the name of all "constructed variables"
          f_able age_educ, auto
          
          ** because of numerical derivatives you may need "noestimcheck"
          margins, dydx(age) atmeans noestimcheck
          margins, dydx(educ) atmeans noestimcheck
          
          margins, atmeans expression(_b[age_educ]/educ)
          margins, atmeans expression(-_b[age_educ]*age/educ^2)
          Note that you will need to include the original variables with "o.", and possibly add -noestimcheck-.
          I'm also providing the "manual" version, in which the exact marginal effect is derived analytically (the last two margins commands).
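
          The analytic expressions behind the two expression() calls above follow from differentiating the fitted equation. A short sketch, writing \(\beta\) for _b[age_educ] and \(\alpha\) for the constant (notation assumed here, not from the post):

          \[
          \widehat{lnwage} = \alpha + \beta \cdot \frac{age}{educ}
          \quad\Rightarrow\quad
          \frac{\partial\, \widehat{lnwage}}{\partial\, age} = \frac{\beta}{educ},
          \qquad
          \frac{\partial\, \widehat{lnwage}}{\partial\, educ} = -\beta \cdot \frac{age}{educ^2}
          \]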

          You can also use this with other commands (like probit/logit), although it will take a bit of extra time because of the numerical derivatives.

          HTH
          F

          Just for fun, here is John Mullahy's example:

          Code:
          sysuse auto, clear
          fgen lw_ll = log(weight)/log(length)
          reg price lw_ll o.weight o.length
          f_able lw_ll, auto
          margins, dydx(*) noestimcheck
          Last edited by FernandoRios; 02 Aug 2023, 13:41.



          • #6
            Originally posted by FernandoRios

            Hi Fernando ( @FernandoRios ),

            Thank you for the very helpful suggestion. f_able is pretty much exactly what I was looking for. I also see that you wrote the package, thank you, it was much needed.

            I’ve been partially able to use your example and adjust it for my needs, with two issues. I have found a workaround for one of them but the other I cannot solve. Would love your suggestions.
            I am using Stata MP 14.2.

            Here is a simplified version of my code:
            Code:
            cap drop lnpl
            fgen lnpl = ln(pay)/ln(length)
             
            probit taskaccept c.lnpl##i.female o.pay o.length ///
            ,  nolog vce(cluster phonenumber)
             
            cap drop __lnpl // workaround to address error 1
            f_able , nlvar(lnpl)
            margins i.female , dydx(pay length) nochainrule noestimcheck numerical 
            // error 2 pops up here
            di "end of code" // this line does not run
            Here are the errors I get:

            Error 1 (Found a workaround):
            variable __lnpl already defined
            r(110);

            This error shows up before running the line with the margins command.
            I address this by inserting code (cap drop __lnpl). It’s not ideal, but it works.

            Error 2 (Did not find a solution):
            something required
            r(100);


            This second error shows up after running the margins command. It stops the rest of the do-file from running. I have tried out your code with the frause Oaxaca dataset, and the same issue arises. Stata runs this line (margins, dydx(age) atmeans noestimcheck) and then shows the error (r(100) with the same error message). Stata does not run the remaining lines of code (i.e., margins, dydx(educ) atmeans noestimcheck, etc.)

            I am guessing it is not a coding error since the same error shows up with your example code. Is this a Stata version issue? Can you please suggest a workaround? Is there a way that I can override the error since it does not seem to stop it from running the line of code it is referring to?

            Thanks for your help.
            Last edited by Saika Belal; 03 Aug 2023, 14:47.



            • #7
              I may have found a solution/workaround: adding the option post after the margins command.

              Code:
              cap drop lnpl
              fgen lnpl = ln(pay)/ln(length)
              probit taskaccept c.lnpl##i.female o.pay o.length ///
              , nolog vce(cluster phonenumber)
              cap drop __lnpl // workaround to address error 1
              f_able , nlvar(lnpl)
              margins i.female , dydx(pay length) nochainrule noestimcheck numerical post // added option post
              di "end of code" // now this line of code does run!



              • #8
                Are you using the version from the Stata Journal or from SSC?
                Also, the syntax I used in my examples was an upgrade.
                Try using that one.



                • #9
                  Originally posted by FernandoRios
                  Are you using the version from the Stata Journal or from SSC?
                  Also, the syntax I used in my examples was an upgrade.
                  Try using that one.
                  I am using it from ssc (ssc install f_able). Which part of your code was an upgrade? That way I will know what to adjust.



                  • #10
                    OK, I found the error.
                    It has something to do with changes to margins in Stata 14.

                    So please open the file f_able_epilog.ado
                    and replace its contents with this:

                    Code:
                    program f_able_epilog
                        syntax [anything]
                        * if no names are given, fall back to the list stored in e(nldepvar)
                        if missing("`anything'") local anything `e(nldepvar)'
                        foreach i of local anything {
                            * copy each variable back from its __ counterpart, then drop that copy
                            qui: replace `i' = __`i'
                            qui: drop __`i'
                        }
                    end

                    That should fix the problem.
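
                    A short sketch of how the edit might be applied, using only built-in Stata commands: which reports where the installed copy of the file lives, and discard clears programs already loaded in memory so the edited definition is picked up. Since discard also clears stored estimation results, the model would need to be re-estimated afterwards.

                    Code:
                    * Show the location of the installed file to edit
                    which f_able_epilog

                    * After saving the edited file, drop the programs already in memory
                    * so that Stata reads the new definition on the next call
                    discard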


