  • Please help! Quadratic Curves Inverting WRONGLY!

    Dear all,

    I use Stata 16 and humbly seek help. I have two cases where the fitted quadratic curves are bending the WRONG way. I did everything right (or so I think).

    **First Case: I should have an INVERTED U-shaped curve. But the curve keeps showing an upward trend.
    Please see the code I executed and the output.

    Code:
    twoway qfit lnco2 enu
    xtpcse lnco2 enu enusq lnpc popgr renew rq, rhotype(freg) np1
    di -_b[enu]/(2 * _b[enusq])
    local tp = -_b[enu]/(2 * _b[enusq])
    twoway qfit lnco2 enu, xli(`tp')

    Code:
    . twoway qfit lnco2 enu
    
    . xtpcse lnco2 enu enusq lnpc popgr renew rq, rhotype(freg) np1
    (note: the number of observations per panel, e(n_sigma) = 4,
           used to compute the disturbance of covariance matrix e(Sigma)
           is less than half of the average number of observations per panel,
           e(n_avg) = 14.714286; you may want to consider the pairwise option)
    
    Linear regression, correlated panels corrected standard errors (PCSEs)
    
    Group variable:   c_id                          Number of obs     =        103
    Time variable:    year                          Number of groups  =          7
    Panels:           correlated (unbalanced)       Obs per group:
    Autocorrelation:  no autocorrelation                          min =          4
    Sigma computed by casewise selection                          avg =  14.714286
                                                                  max =         19
    Estimated covariances      =        28          R-squared         =     0.8832
    Estimated autocorrelations =         0          Wald chi2(6)      =    3176.03
    Estimated coefficients     =         7          Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
                 |           Panel-corrected
           lnco2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             enu |   .0083711   .0007237    11.57   0.000     .0069527    .0097895
           enusq |  -8.44e-06   1.11e-06    -7.60   0.000    -.0000106   -6.26e-06
            lnpc |   .5700028   .1329954     4.29   0.000     .3093366     .830669
           popgr |   .1243389   .0564481     2.20   0.028     .0137027     .234975
           renew |  -.0256073   .0027666    -9.26   0.000    -.0310297   -.0201848
              rq |  -.4014216   .1457195    -2.75   0.006    -.6870265   -.1158167
           _cons |  -5.346004   1.147205    -4.66   0.000    -7.594484   -3.097524
    ------------------------------------------------------------------------------
    
    . di -_b[enu]/(2 * _b[enusq])
    496.0271
    
    . local tp = -_b[enu]/(2 * _b[enusq])
    
    . twoway qfit lnco2 enu, xli(`tp')
    **Second Case: I should have a U-shaped curve. But what I got is an INVERTED U-shaped curve.
    Please see the code I used:
    Code:
    twoway qfit lnco2 kofgi
    xtpcse lnco2 lnpc popgr renew rq kofgi kofgi2, rhotype(freg) np1
    di -_b[kofgi]/(2 * _b[kofgi2])
    local tp = -_b[kofgi]/(2 * _b[kofgi2])
    twoway qfit lnco2 kofgi, xli(`tp')
    The output:
    Code:
    . twoway qfit lnco2 kofgi
    
    . xtpcse lnco2 lnpc popgr renew rq kofgi kofgi2, rhotype(freg) np1
    
    Linear regression, correlated panels corrected standard errors (PCSEs)
    
    Group variable:   c_id                          Number of obs     =        140
    Time variable:    year                          Number of groups  =          7
    Panels:           correlated (balanced)         Obs per group:
    Autocorrelation:  no autocorrelation                          min =         20
                                                                  avg =         20
                                                                  max =         20
    Estimated covariances      =        28          R-squared         =     0.7736
    Estimated autocorrelations =         0          Wald chi2(6)      =    1891.41
    Estimated coefficients     =         7          Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
                 |           Panel-corrected
           lnco2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            lnpc |   .6572758   .0212834    30.88   0.000      .615561    .6989906
           popgr |   .2181154   .0488538     4.46   0.000     .1223638     .313867
           renew |   -.006806   .0013833    -4.92   0.000    -.0095171   -.0040948
              rq |  -.2882291   .0664838    -4.34   0.000    -.4185349   -.1579234
           kofgi |   -.116146   .0327807    -3.54   0.000     -.180395    -.051897
          kofgi2 |   .0014956    .000385     3.88   0.000      .000741    .0022503
           _cons |  -3.270156   .6360434    -5.14   0.000    -4.516778   -2.023533
    ------------------------------------------------------------------------------
    
    . di -_b[kofgi]/(2 * _b[kofgi2])
    38.827971
    
    . local tp = -_b[kofgi]/(2 * _b[kofgi2])
    
    . twoway qfit lnco2 kofgi, xli(`tp')
    I have checked and re-checked the code and cannot work out where the problem lies.
    I would appreciate any assistance.
    Thanks in advance.
    Ngozi

  • #2
    Well, -twoway qfit- fits a quadratic curve using only your dependent variable and the linear and quadratic terms--it is not adjusted for other variables. The -xtpcse- commands include many other variables. There is no reason to think that the results will be the same. The conclusion is that the direct relationship between lnco2 and kofgi or enu is confounded by the other variables in your regressions. Whether the adjusted or unadjusted analyses are more appropriate for your research questions depends on the questions.

    If you want to create plots that reflect the results of the regressions, you cannot use -twoway qfit- for that. You have to re-do the regressions using factor-variable notation to create the quadratic terms, and then have -margins- and -marginsplot- create the graphs. So something like this:

    Code:
    xtpcse lnco2 lnpc popgr renew rq c.kofgi##c.kofgi, rhotype(freg) np1
    local tp = -_b[kofgi]/(2 * _b[kofgi2])
    margins, at(kofgi = (list_of_relevant_values_of_kofgi_goes_here)
    marginsplot, xline(`tp')
    But again, you need to decide, based on your research questions, whether the adjusted or unadjusted model is the appropriate one. Then use those results and the corresponding graphs.


    • #3
      Thanks a lot, Prof. Schechter
      I actually used the same approach for a previous project and got exactly the plot I expected.
      See it here:
      Code:
      xtpcse lnco2pc pc pcsq lnenu lndcb popg lnfdini lntr k2 y2-y37, hetonly
      di -_b[pc]/(2 * _b[pcsq])
      local tp = -_b[pc]/(2 * _b[pcsq])
      twoway qfit lnco2pc pc, xli(`tp')
      and the output:
      Code:
      xtpcse lnco2pc pc pcsq lnenu lndcb popg lnfdini lntr k2 y2-y37, hetonly
      
      Number of gaps in sample:  33
      note: y36 omitted because of collinearity
      note: y37 omitted because of collinearity
      
      Linear regression, heteroskedastic panels corrected standard errors
      
      Group variable:   c_id                          Number of obs     =        508
      Time variable:    year                          Number of groups  =         19
      Panels:           heteroskedastic (unbalanced)  Obs per group:
      Autocorrelation:  no autocorrelation                          min =          8
                                                                    avg =  26.736842
                                                                    max =         35
      Estimated covariances      =        19          R-squared         =     0.8659
      Estimated autocorrelations =         0          Wald chi2(42)     =    4214.72
      Estimated coefficients     =        43          Prob > chi2       =     0.0000
      
      ------------------------------------------------------------------------------
                   |            Het-corrected
           lnco2pc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
                pc |   .0005865   .0000398    14.73   0.000     .0005085    .0006646
              pcsq |  -3.19e-08   2.88e-09   -11.08   0.000    -3.76e-08   -2.63e-08
             lnenu |   .4366533   .0678342     6.44   0.000     .3037008    .5696058
             lndcb |   .3442699   .0415652     8.28   0.000     .2628036    .4257363
              popg |  -.0822604   .0384713    -2.14   0.032    -.1576627   -.0068581
           lnfdini |   .0484114   .0157768     3.07   0.002     .0174894    .0793334
              lntr |  -.1261522    .056504    -2.23   0.026     -.236898   -.0154064
                k2 |  -.0288462    .061343    -0.47   0.638    -.1490763    .0913838
                y2 |   .0617816   .1782102     0.35   0.729    -.2875039    .4110671
                y3 |   .1937459    .175764     1.10   0.270    -.1507452    .5382369
                y4 |   .0635031   .1778505     0.36   0.721    -.2850775    .4120837
                y5 |   .1245039   .1829247     0.68   0.496     -.234022    .4830297
                y6 |   .2153634    .185913     1.16   0.247    -.1490193    .5797462
                y7 |   .0394497   .1933122     0.20   0.838    -.3394353    .4183347
               y33 |  -.4327306   .1913891    -2.26   0.024    -.8078465   -.0576148
               y34 |  -.3997159   .1883905    -2.12   0.034    -.7689544   -.0304774
               y35 |  -.3514974   .1903157    -1.85   0.065    -.7245094    .0215146
               y36 |          0  (omitted)
               y37 |          0  (omitted)
             _cons |  -5.327906   .5717428    -9.32   0.000    -6.448501   -4.207311
      ------------------------------------------------------------------------------
      
      . di -_b[pc]/(2 * _b[pcsq])
      9188.5859
      
      . local tp = -_b[pc]/(2 * _b[pcsq])
      
      . twoway qfit lnco2pc pc, xli(`tp')
      So I'm baffled as to why it isn't producing a similar plot this time around.

      Also, I tried your code and got this error message:
      Code:
      xtpcse lnco2 lnpc popgr renew rq c.kofgi##c.kofgi, rhotype(freg) np1
      
      Linear regression, correlated panels corrected standard errors (PCSEs)
      
      Group variable:   c_id                          Number of obs     =        140
      Time variable:    year                          Number of groups  =          7
      Panels:           correlated (balanced)         Obs per group:
      Autocorrelation:  no autocorrelation                          min =         20
                                                                    avg =         20
                                                                    max =         20
      Estimated covariances      =        28          R-squared         =     0.7736
      Estimated autocorrelations =         0          Wald chi2(6)      =    1891.41
      Estimated coefficients     =         7          Prob > chi2       =     0.0000
      
      ---------------------------------------------------------------------------------
                      |           Panel-corrected
                lnco2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ----------------+----------------------------------------------------------------
                 lnpc |   .6572758   .0212834    30.88   0.000      .615561    .6989906
                popgr |   .2181154   .0488538     4.46   0.000     .1223638     .313867
                renew |  -.0068059   .0013833    -4.92   0.000    -.0095171   -.0040948
                   rq |  -.2882291   .0664838    -4.34   0.000    -.4185348   -.1579233
                kofgi |   -.116146   .0327807    -3.54   0.000    -.1803949    -.051897
                      |
      c.kofgi#c.kofgi |   .0014956    .000385     3.88   0.000      .000741    .0022503
                      |
                _cons |  -3.270157   .6360433    -5.14   0.000    -4.516779   -2.023535
      ---------------------------------------------------------------------------------
      
      . local tp = -_b[kofgi]/(2 * _b[kofgi2])
      [kofgi2] not found
      r(111);
      For the relevant values, do you imply the minimum and maximum values?
      Code:
      margins, at(kofgi = (list_of_relevant_values_of_kofgi_goes_here)
      How will I input them in the syntax?
      Thanks for the guide, appreciated.
      Ngozi


      • #4
        What you show in #3 is a different analysis using different variables. When you take a simple model, regressing Y on X and X^2 and add additional variables, anything can happen. The results may change drastically, or not change much at all. It depends on how correlated the added variables are with X and Y. In the example you report in #3, apparently things didn't change much. You were "lucky," in the sense that you were not surprised. But your surprise in what you show in #1 is due entirely to your unwarranted expectation that you would be "lucky" again in these analyses.
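
        As a rough illustration of how that can happen, here is a small simulation. It is only a sketch: the variables x, z and y, the coefficients, the sample size and the seed are all made up for the example and have nothing to do with your data. The covariate z is built to be related to x^2, so leaving it out should change the sign of the coefficient on the squared term:
        Code:
        clear
        set seed 12345
        set obs 1000
        * Made-up data: z is tied to x^2 and also affects y
        generate x = runiform(0, 10)
        generate z = x^2 + 5*rnormal()
        generate y = 2*x - 0.1*x^2 + 0.5*z + rnormal()
        * Unadjusted fit: the squared term absorbs the effect of z (expect a positive sign)
        regress y c.x##c.x
        * Adjusted fit: with z in the model, expect a negative sign on the squared term
        regress y c.x##c.x z
        Neither fit is "wrong"; they answer different questions, which is the same situation as -twoway qfit- versus -xtpcse- with covariates.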

        For the relevant values, do you imply the minimum and maximum values?
        Code:

        margins, at(kofgi = (list_of_relevant_values_of_kofgi_goes_here)
        How will I input them in the syntax?
        The minimum and maximum values might or might not be relevant. They would, in any case, not be sufficient. By relevant I mean values of kofgi that are normally encountered in real world situations. The minimum and maximum in your sample might fit that description--or they might be unusual outliers that people might not care much about. This judgment is not a statistical one--it is a pragmatic one based on knowledge of these variables' real-world meaning and implications. So I can't help you pick them. If you are not confident of your own judgment in the subject matter area, consult a colleague who has experience. With that said, you want a large enough number of values to reasonably fill in the graph. If you used just a low and high value, then the graph -marginsplot- draws would just be a straight line connecting those points.

        To illustrate the syntax, suppose that a relevant, interesting range of values is from 5,000 to 15,000. Then the syntax in -margins- would look like this:
        Code:
        margins, at(kofgi = (5000(1000)15000))
        That way -margins- will calculate the expected value of lnco2 at each value of kofgi from 5000 through 15,000 in increments of 1000. That will provide 11 different values of kofgi, which is enough to convey the shape of the curve. By the way, note the parentheses in this code--the code I showed in #2 was incorrect and had unbalanced parentheses.
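
        If it helps to see which values actually occur before choosing the range, something along these lines could be a starting point. It is only a sketch: it assumes the factor-variable regression from #2 has just been run, and the range 30 to 90 is a placeholder, not a recommendation:
        Code:
        * Inspect the distribution of kofgi (percentiles, smallest and largest values)
        summarize kofgi, detail
        * Placeholder range: 13 values from 30 to 90 in steps of 5
        margins, at(kofgi = (30(5)90))
        marginsplot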

        I notice also that your calculation of the turning point failed after the final regression in #3. That's because you tried to reference _b[kofgi2], but in your factor-variable model there is no such variable. You have to use factor-variable notation to refer to that coefficient as well, whether in a -local- definition or in -nlcom-. So that should be:
        Code:
        local tp = -_b[kofgi]/(2*_b[c.kofgi#c.kofgi])
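        If you also want a standard error and confidence interval for the turning point, -nlcom- accepts the same expression. A sketch, assuming the factor-variable -xtpcse- model above is the most recent estimation (the label tp is just a name chosen here):
        Code:
        nlcom (tp: -_b[kofgi]/(2*_b[c.kofgi#c.kofgi]))
        The interval comes from the delta method, so treat it as an approximation.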


        • #5
          Dear Prof. Schechter,

          I have come to say THANK YOU!!!

          This code produced EXACTLY what I needed:

          Code:
          xtpcse lnco2 lnpc popgr renew rq c.enu##c.enu, rhotype(freg) np1
          local tp = -_b[enu]/(2 * _b[c.enu#c.enu])
          margins, at(enu = (100(50)1000))
          marginsplot, xline(`tp')
          and:

          Code:
          xtpcse lnco2 lnpc popgr renew rq c.kofgi##c.kofgi, rhotype(freg) np1
          local tp = -_b[kofgi]/(2*_b[c.kofgi#c.kofgi])
          margins, at(kofgi = (10(10)100))
          marginsplot, xline(`tp')
          May you LIVE long, Prof!
          Gracias, Ngozi


          • #6
            Hello,

            I am doing my bachelor's thesis and I have encountered the same problem in finding the turning point of my EKC. I used trade openness (TO) and trade openness squared (TO2) as my main explanatory variables, with CO2 emissions as the measure of environmental quality. My results show a negative coefficient on TO and a positive coefficient on TO2. As I have learned from the literature, those signs imply that my model describes a U-shaped curve. This is where the difficulty comes in: I want to draw that U-shaped curve in Stata and I do not know the appropriate commands to use. I've tried the commands suggested in this thread, but they show an inverted U-shaped curve.

            I would appreciate all of your insights/responses regarding this. Thanks in advance!

            Regards,
            Justine


            • #7
              It is unlikely that anybody will be able to help you without seeing the actual commands you tried and the results you got from Stata. Example data, using the -dataex- command would likely be needed as well. Please post back with those. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

              When asking for help with code, always show example data. When showing example data, always use -dataex-.


              • #8
                Justine Borja You are running the same question in two threads. I will post in the other one suggesting arbitrarily that people come here. But in general posting the same question several times doesn't multiply your chances of a good answer, just the scope for a fragmented and repetitive response. One thread at a time please.
                Last edited by Nick Cox; 04 Nov 2021, 11:21.


                • #9
                  Hello Sir Clyde Schechter. I am testing the validity of the Environmental Kuznets Curve in my paper using the ARDL model. I treat trade openness and trade openness squared as my main variables in relation to CO2 emissions. I am using time-series data. My results are as follows:

                  Code:
                    ardl lnco2cb_pc lnto lnto2 lnrgdp_pc lnenergy, lags(1 2 2 1 0) ec
                  Code:
                   These are the short-run and long-run parameters:
                  
                  ARDL(1,2,2,1,0) regression
                  
                  Sample:     1992 -     2019                     Number of obs     =         28
                                                                  R-squared         =     0.8664
                                                                  Adj R-squared     =     0.7879
                  Log likelihood =  64.159115                     Root MSE          =     0.0314
                  
                  ------------------------------------------------------------------------------
                  D.lnco2cb_pc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                  -------------+----------------------------------------------------------------
                  ADJ          |
                    lnco2cb_pc |
                           L1. |  -1.033295   .1702565    -6.07   0.000    -1.392505   -.6740853
                  -------------+----------------------------------------------------------------
                  LR           |
                          lnto |  -.1261742   .0676143    -1.87   0.079    -.2688279    .0164794
                         lnto2 |   .0730523   .0372009     1.96   0.066    -.0054347    .1515393
                     lnrgdp_pc |   .3476238   .0481941     7.21   0.000     .2459431    .4493045
                      lnenergy |   .5848107   .1429328     4.09   0.001     .2832488    .8863725
                  -------------+----------------------------------------------------------------
                  SR           |
                          lnto |
                           D1. |   .2737837   .1169716     2.34   0.032     .0269952    .5205722
                           LD. |   .2597734   .1301159     2.00   0.062    -.0147471    .5342939
                               |
                         lnto2 |
                           D1. |   .3707275   .1138614     3.26   0.005     .1305009     .610954
                           LD. |   .3233343   .1237917     2.61   0.018     .0621566     .584512
                               |
                     lnrgdp_pc |
                           D1. |   .6200075   .3585193     1.73   0.102    -.1364022    1.376417
                               |
                         _cons |  -3.058857   .9130884    -3.35   0.004    -4.985305   -1.132409
                  ------------------------------------------------------------------------------

                  Note that I am only interested in the coefficients of the long-run parameters, lnto and lnto2. The signs suggest that there exists a U-shaped curve. I have calculated the turning point and found it to be at .86358814 (min = -1.695541, max = .0979904). I want to visually represent this U-shaped curve to solidify my case.

                  Thanks,
                  Justine
                  Last edited by Justine Borja; 05 Nov 2021, 10:48.


                  • #10
                    -ardl- is a user-written command that I am unfamiliar with. From its description at SSC I see it was written for version 11.2. I do not know if it supports factor variable notation. If it does, then your first step is to re-do the regression properly with factor variable notation:

                    Code:
                    ardl lnco2cb_pc c.lnto##c.lnto lnrgdp_pc lnenergy, lags(1 2 2 1 0) ec
                    margins, at(lnto = some_list_of_realistic_values_of_lnto)
                    marginsplot
                    The list of realistic values of lnto should consist of about a dozen values that more or less equally span the range of values of lnto that are commonly observed in real life. If you want to highlight the turning point, then that list should include the value 0.86358814. (If 0.86358814 is not a realistic value for lnto, then your model has suggested that you do not in fact have a quadratic relationship, but rather a somewhat curvilinear relationship that does not turn around in real life.)

                    If -ardl- does not support factor-variable notation or is not supported by -margins-, you would have to write code that, in effect, emulates what -margins- would do to calculate expected values conditional on the values of lnto. As I don't know what -ardl- actually does, I can't help you with that.


                    • #11
                      Unfortunately, the ARDL does not allow for factor variable notation in estimation. Do you know any other way to visually represent a U-shaped curve in the time-series? Thanks!


                      • #12
                        Does the -predict- command run after ardl? If so, I can show you how to wrap that in some code that will get you your quadratic curve.

                        If not, somebody who knows what -ardl- does and how to calculate predicted values from its results will have to respond. There are probably Forum members who can do that, but they may not be following this thread. If none of them chimes in here within about 24 hours, start a new thread, and make sure to mention ARDL in the thread title.


                        • #13
                          Good day. I have tried the predict command after ARDL, but I don't know if this is the -predict- command you are referring to.

                          Code:
                            ardl lnco2cb_pc lnto lnto2 lnrgdp_pc lnenergy, lags(1 2 2 1 0) ec
                          Code:
                            predict lnco2cb_pchat
                          Is that right, Sir Clyde Schechter?


                          Thanks,
                          Justine


                          • #14
                            Yes, that's the one.

                            So, something like this. This code is not tested, so you may encounter errors in it, but this is the gist of the approach.

                            Code:
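                            * Values of lnto at which to evaluate the fitted curve
                            local values -1.7(0.1)0.1
                            
                            * Frame that collects one row per value of lnto
                            frame create margin_calculations float(lnto expected_value)
                            
                            * Keep copies so the original data can be restored afterwards
                            clonevar lnto_original = lnto
                            clonevar lnto2_original = lnto2
                            
                            foreach v of numlist `values' {
                                * Set lnto (and its square) to the same value in every observation
                                replace lnto = `v'
                                replace lnto2 = `v'*`v'
                                * Average the predictions over the sample
                                predict pchat
                                summ pchat, meanonly
                                frame post margin_calculations (`v') (r(mean))
                                drop pchat
                            }
                            * Restore the original data
                            replace lnto = lnto_original
                            replace lnto2 = lnto2_original
                            
                            frame change margin_calculations
                            sort lnto
                            graph twoway connected expected_value lnto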
                            You can modify the numbers in local values to reflect what you think are the realistic and interesting range of values of lnto. I rounded off the range you mentioned in #9 to get these, but you might prefer some other set. You can modify the graph itself to suit your preferences using all of the options available with -graph twoway-.


                            • #15
                              Your help is deeply appreciated! However, I have realized that a negative number does not seem appropriate for trade openness, because the index cannot be negative; its lowest possible value is 0. Or is a negative number present because the variable has been transformed to logarithms?
