
  • Margins expression

    For a panel fixed-effects regression in which both the IV and the DV have been log-transformed, I would like to know how I can use the -margins- command to identify the marginal effect of the original (non-log-transformed) IV on the original (non-log-transformed) DV. To give a very simple example, let's take the IV and DV to be x and y. The regression command is:

    Code:
    xtreg ln_y ln_x, fe

    -margins, dydx(ln_x)- will give the marginal impact of ln_x on ln_y. However, I would like to find the marginal impact of x on y. Is there any way to use the "expression" option with -margins- to obtain this? Thank you.

  • #2
    No, this can't be done.

    First, there is a conceptual issue. If the relationship between ln_y and ln_x is linear, so that the -xtreg- model is correctly specified, then the relationship between y and x is necessarily non-linear. This implies that there is no such thing as the marginal effect of x on y. Rather, there will be infinitely many such marginal effects, and they will depend on your starting values of x and y.

    That said, if you have specific values of x and y in mind as starting values, you can dust off your memories of calculus and use the chain rule. You are looking to calculate dy/dx. Let's use the notation u = ln(x) and v = ln(y). What you have is dv/du. Well, the chain rule tells us that
    Code:
    dy/dx = (dy/dv) * (dv/du) * (du/dx)
    Now, dv/du you can get directly from your -xtreg- output. (Well, maybe: if you are referring to the same modeling you had in an earlier thread where your ln_x was actually entered quadratically in the model, then you have to first get dv/du from the -margins- command at the values of u that correspond to the values of x you are interested in.)

    Now, what about dy/dv and du/dx? Well, u = ln(x), so du/dx = 1/x. And v = ln(y), so y = exp(v), so dy/dv = exp(v) = y. Putting this all together:
    Code:
    dy/dx = y * (dv/du) * (1/x) = (y/x) * (dv/du)
    So, given a series of interesting values of x, you first have to get the corresponding values of ln(x). Then you can run -margins, dydx(ln_x) at(ln_x = (whatever))-. This will give you the values of dv/du that correspond to those values of ln_x. (I'm assuming here we're dealing with the model containing a quadratic term in ln_x. If ln_x appears only linearly in your -xtreg- command, then you can just use the coefficient of ln_x from the -xtreg- output.) Next, you have to get the values of y that correspond to x. For that, you would run -margins, expression(exp(predict(xb))) at(ln_x = (the same whatever as before))- to get the corresponding values of y. Then calculate (y/x)*dv/du for each of those and you have your marginal effects.

    It's a bit of a coding exercise as you will need to loop over those values of x you choose. And there is the issue of quantifying uncertainty around these, which I don't actually know how to do. The problem is that in addition to the dv/du values having uncertainty, so does the value of y, and I don't know how you can combine these to come up with a correct estimate of the uncertainty in the final result.
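    For what it's worth, that loop (setting the standard-error problem aside) might be sketched along the following lines. The model specification and the list of x values here are purely illustrative:
    Code:
    * fit the model (quadratic in ln_x)
    xtreg ln_y c.ln_x##c.ln_x, fe

    * illustrative values of x at which to evaluate dy/dx
    foreach x of numlist 1 2 5 10 {
        local u = ln(`x')
        * dv/du at this value of ln_x
        quietly margins, dydx(ln_x) at(ln_x = `u')
        local dvdu = r(table)[1,1]
        * predicted y = exp(xb) at this value of ln_x
        quietly margins, expression(exp(predict(xb))) at(ln_x = `u')
        local y = r(table)[1,1]
        * chain rule: dy/dx = (y/x) * dv/du
        display "x = `x'   dy/dx = " %9.4f (`y'/`x')*`dvdu'
    }
    Each pass pulls the point estimate from r(table) after -margins-; as noted, this gives point estimates only, not their standard errors.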



    • #3
      By the way, going the other way is quite easy. If you have a relationship -xtreg y x- and you want the marginal effect of ln(x) on ln(y), a.k.a. the elasticity, that is built into -margins-:
      Code:
      margins, eyex(x)
      Again, in principle, this should be done at particular values of x via the -at()- option; the command above gives an "average" elasticity. But it is very easy to do.
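      For example, to evaluate the elasticity at particular values of x rather than averaging over the sample (the values shown are placeholders; substitute values meaningful in your data):
      Code:
      margins, eyex(x) at(x = (10 20 50))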



      • #4
        Thank you Clyde. Your response (#2) is very helpful.

        The question in this post did in fact pertain to the model that I had in the earlier thread (and, yes, it did have a quadratic term for ln_x). Based on your suggestion above (#2), I was able to work out the dy/dx's for a few sample values of x; the calculus chain rule explanation was very clear (thanks!). The marginal effects (i.e., the dy/dx's) are not trivial, which is interesting for my analysis.

        I am however thinking about your comment below
        Code:
        And there is the issue of quantifying uncertainty around these, which I don't actually know how to do. The problem is that in addition to the dv/du values having uncertainty, so does the value of y, and I don't know how you can combine these to come up with a correct estimate of the uncertainty in the final result.
        By "uncertainty" do you mean the impact of the error terms in the original regression function (i.e. xtreg ln_y ln_x, fe) and the error terms related to dv/du?

        If my understanding is correct, I presume this "uncertainty" would occur whenever there is a log-log transformation of the IV and DV (i.e., it is not unique to my model). That leads me to wonder whether such transformations, which are done to correct for data skewness, in fact add a new level of complexity to the conclusions. (This is just an observation from my side.)



        • #5
          By "uncertainty" do you mean the impact of the error terms in the original regression function (i.e. xtreg ln_y ln_x, fe) and the error terms related to dv/du?
          I was referring specifically to the fact that regression and margins outputs come with standard errors (from which one calculates confidence intervals and test statistics) and these quantify the range of variation that might be seen with repeated sampling.

          If my understanding is correct, I presume this "uncertainty" would occur whenever there is a log-log transformation of the IV and DV (i.e. it is not unique to my model.)
          That is correct. The -eyex()- option of -margins- does this calculation when you are going the other way (i.e., from an x-y regression to a ln_x-ln_y marginal effect). And other post-estimation commands like -lincom- and -nlcom- calculate standard errors for certain expressions calculated from regression coefficients. But I don't know how to calculate the standard errors for an x-y marginal effect when the original regression was ln_x-ln_y.

          That leads me to wonder whether such transformations which are done to correct for data skewness in fact add a new level of complexity to the conclusions.
          Yes, they do.



          • #6
            Thanks for your clarifications, Clyde. They are very helpful as I think through the analysis.



            • #7
              Hi Clyde,

              I have a basic question (not necessarily Stata-related) pertaining to post #2 above. I am summarizing the background for the question in points 1-3 (for new readers), and my question is in point 4:
              1. I have log-transformed my original variables x and y as follows: u = ln(x) and v = ln(y)
              2. I have then fit a linear model between u and v: xtreg v c.u##c.u, fe (Please note that u enters the model in quadratic form as well.)
              3. I have computed the marginal effect of u on v (i.e., dv/du) at different values of u by using the Stata command -margins-. I find that dv/du is significant at all values computed.
              4. Given that u and v are functions of x and y respectively, my view is that dy/dx would be significant at the corresponding values of x as well.
              I am not sure whether this intuition is correct. In case anyone on this forum can clarify, I would be grateful. (I have a basic familiarity with statistics but do not have a very high level of expertise. Hence, if my question is too basic or does not belong on this Stata forum, please ignore it.) Thank you.






              • #8
                That intuition is not correct. It may be correct as applied to your particular data, but it is not in general true. First, I'm not going to talk about it in terms of statistical significance, as it just adds yet another layer of transformations on top of an already complicated problem and it is also a concept that I do not consider valid in general, and especially in this context. Let's just talk about small and large effects.

                The problem is the non-linearity of the u -> x and v -> y relationships. This manifests itself in the relationship dy/dx = (y/x) dv/du, as derived in #2. Now, dv/du can be a big number, but if the corresponding values of x and y are such that the y/x ratio is close to zero (e.g., if x is very large or y is very small), that y/x factor can smash dy/dx down to next to nothing. In a purely linear v:u relationship with a positive dv/du, we might not have to worry about that happening, as large x would generally be associated with large y, and y/x would not vary very much outside of some region near x = 0, which is probably out of the range of realistic data values. But here we have a quadratic relationship between v and u, which means that the combination of small y with large x is, in principle, possible.
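                To put made-up numbers on that squashing: suppose dv/du = 2 (a sizeable effect on the log-log scale) at a point where x = 1000 and y = 5. Then
                Code:
                display "dy/dx = " (5/1000)*2
                returns dy/dx = .01; a large log-scale effect corresponds to a nearly flat raw-scale slope at that point.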

                This discussion leads me to some questions about your use of a quadratic model. You state that all of your dv/du values are "significant." In principle, that is impossible in a quadratic model. Every quadratic relationship has a vertex, a turning point, and at that turning point dv/du = 0; in some region around that point, it will be "not significant." So the question becomes: where is your turning point? Perhaps it is outside the range of observed values of u (x), so you see only large values of dv/du in your data. In that case, why use a quadratic model in the first place? If there is no turning point within striking distance of the observed data, then at most you are capturing some modest curvilinearity in the v:u relationship. Is it curvy enough to warrant the additional complexity? Or is it just basically a line that veers slightly off course as you approach one end of the data? This has implications for the question you pose in #7, because if you have a linear v:u relationship with positive slope, then, as indicated in the previous paragraph, the problem of (y/x) squashing dv/du close to zero will arise only in a limited range of values of u (x), and perhaps that limited range is also outside the observed range of the data.



                • #9
                  Thank you Clyde for your response. I can see the problem you are describing.

                  Also, as you correctly pointed out, the turning point of my inverted-U graph is at the extreme left, and my values of interest lie to the right of it. Hence, the portion of the graph I am interested in is in fact only the downward-sloping portion.

                  While I have managed to calculate dy/dx using the manual process above (the calculus chain rule), I do not have any solution for calculating the CIs. To that extent, my results are inconclusive. While doing some internet research on this problem, I came across an article (Stata Tip 128, The Stata Journal, 2017) which highlights this challenge in computing the marginal effect of log-transformed covariates (and in my case the IV is also log-transformed, hence I have a more complex situation).

                  While I do not have a solution, I would like to thank you for helping with your clarifications.



                  • #10
                    If the relationship between ln_y and ln_x is linear, so that the -xtreg- model is correctly specified, then the relationship between y and x is necessarily non-linear. This implies that there is no such thing as the marginal effect of x on y. Rather there will be infinitely many such marginal effects,
                    Excellent. I believe that statement will help many understand Stata’s concept of margins.

                    I think a similar statement should be highlighted at the start of Stata's manuals on -margins- and -marginsplot-. (If it is, then please bear with my bad memory.) I have the impression that not all who use Stata's -margins- understand them.

