Please help! Quadratic Curves Inverting WRONGLY!

Ngozi ADELEYE

Join Date: Apr 2014
Posts: 80

11 Oct 2021, 14:26

Dear all,

I use Stata 16 and I humbly seek help. I have 2 cases where the parabolic curves are inverting WRONGLY. I did everything right (or so I think).

First Case: I should have an INVERTED U-shaped curve. But the curve keeps showing an upward trend.
Please see the code I executed and the output.

Code:

twoway qfit lnco2 enu
xtpcse lnco2 enu enusq lnpc popgr renew rq, rhotype(freg) np1
di -_b[enu]/(2 * _b[enusq])
local tp = -_b[enu]/(2 * _b[enusq])
twoway qfit lnco2 enu, xli(`tp')

Code:

. twoway qfit lnco2 enu

. xtpcse lnco2 enu enusq lnpc popgr renew rq, rhotype(freg) np1
(note: the number of observations per panel, e(n_sigma) = 4,
       used to compute the disturbance of covariance matrix e(Sigma)
       is less than half of the average number of observations per panel,
       e(n_avg) = 14.714286; you may want to consider the pairwise option)

Linear regression, correlated panels corrected standard errors (PCSEs)

Group variable:   c_id                          Number of obs     =        103
Time variable:    year                          Number of groups  =          7
Panels:           correlated (unbalanced)       Obs per group:
Autocorrelation:  no autocorrelation                          min =          4
Sigma computed by casewise selection                          avg =  14.714286
                                                              max =         19
Estimated covariances      =        28          R-squared         =     0.8832
Estimated autocorrelations =         0          Wald chi2(6)      =    3176.03
Estimated coefficients     =         7          Prob > chi2       =     0.0000

------------------------------------------------------------------------------
             |           Panel-corrected
       lnco2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         enu |   .0083711   .0007237    11.57   0.000     .0069527    .0097895
       enusq |  -8.44e-06   1.11e-06    -7.60   0.000    -.0000106   -6.26e-06
        lnpc |   .5700028   .1329954     4.29   0.000     .3093366     .830669
       popgr |   .1243389   .0564481     2.20   0.028     .0137027     .234975
       renew |  -.0256073   .0027666    -9.26   0.000    -.0310297   -.0201848
          rq |  -.4014216   .1457195    -2.75   0.006    -.6870265   -.1158167
       _cons |  -5.346004   1.147205    -4.66   0.000    -7.594484   -3.097524
------------------------------------------------------------------------------

. di -_b[enu]/(2 * _b[enusq])
496.0271

. local tp = -_b[enu]/(2 * _b[enusq])

. twoway qfit lnco2 enu, xli(`tp')

Second Case: I should have a U-shaped curve. But what I got is an INVERTED U-shaped curve.
Please see the code I used:

Code:

twoway qfit lnco2 kofgi
xtpcse lnco2 lnpc popgr renew rq kofgi kofgi2, rhotype(freg) np1
di -_b[kofgi]/(2 * _b[kofgi2])
local tp = -_b[kofgi]/(2 * _b[kofgi2])
twoway qfit lnco2 kofgi, xli(`tp')

The output:

Code:

. twoway qfit lnco2 kofgi

. xtpcse lnco2 lnpc popgr renew rq kofgi kofgi2, rhotype(freg) np1

Linear regression, correlated panels corrected standard errors (PCSEs)

Group variable:   c_id                          Number of obs     =        140
Time variable:    year                          Number of groups  =          7
Panels:           correlated (balanced)         Obs per group:
Autocorrelation:  no autocorrelation                          min =         20
                                                              avg =         20
                                                              max =         20
Estimated covariances      =        28          R-squared         =     0.7736
Estimated autocorrelations =         0          Wald chi2(6)      =    1891.41
Estimated coefficients     =         7          Prob > chi2       =     0.0000

------------------------------------------------------------------------------
             |           Panel-corrected
       lnco2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        lnpc |   .6572758   .0212834    30.88   0.000      .615561    .6989906
       popgr |   .2181154   .0488538     4.46   0.000     .1223638     .313867
       renew |   -.006806   .0013833    -4.92   0.000    -.0095171   -.0040948
          rq |  -.2882291   .0664838    -4.34   0.000    -.4185349   -.1579234
       kofgi |   -.116146   .0327807    -3.54   0.000     -.180395    -.051897
      kofgi2 |   .0014956    .000385     3.88   0.000      .000741    .0022503
       _cons |  -3.270156   .6360434    -5.14   0.000    -4.516778   -2.023533
------------------------------------------------------------------------------

. di -_b[kofgi]/(2 * _b[kofgi2])
38.827971

. local tp = -_b[kofgi]/(2 * _b[kofgi2])

. twoway qfit lnco2 kofgi, xli(`tp')

I have checked and re-checked the code and cannot seem to place where the problem lies.
I would appreciate any assistance.
Thanks in advance.
Ngozi

Clyde Schechter

Join Date: Apr 2014

Posts: 30188
#2

11 Oct 2021, 14:55

Well, -twoway qfit- fits a quadratic curve using only your dependent variable and the linear and quadratic terms--it is not adjusted for other variables. The -xtpcse- commands include many other variables. There is no reason to think that the results will be the same. The conclusion is that the direct relationship between lnco2 and kofgi or enu is confounded by the other variables in your regressions. Whether the adjusted or unadjusted analyses are more appropriate for your research questions depends on the questions.
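
The difference between the unadjusted and adjusted fits can be seen in a minimal sketch. It assumes the auto dataset that ships with Stata, with -regress- standing in for -xtpcse-; the variables price, weight, mpg and foreign and the scalar names are illustrative and are not the ones in this thread.

```stata
* Sketch on the built-in auto data (illustrative variables, not the thread's)
sysuse auto, clear

* Unadjusted quadratic: the same regression that -twoway qfit price weight- plots
regress price c.weight##c.weight
scalar b2_unadj = _b[c.weight#c.weight]

* Adjusted quadratic: same linear and squared terms plus other covariates
regress price c.weight##c.weight mpg i.foreign
scalar b2_adj = _b[c.weight#c.weight]

* The two curvature estimates need not agree in size, or even in sign
display "unadjusted: " b2_unadj "   adjusted: " b2_adj
```

Only the first regression corresponds to what -twoway qfit- draws, so a turning point computed from the second one has no particular reason to line up with that plot.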

If you want to create plots that reflect the results of the regressions, you cannot use -twoway qfit- for that. You have to re-do the regressions using factor-variable notation to create the quadratic terms, and then have -margins- and -marginsplot- create the graphs. So something like this:

Code:

xtpcse lnco2 lnpc popgr renew rq c.kofgi##c.kofgi, rhotype(freg) np1
local tp = -_b[kofgi]/(2 * _b[kofgi2])
margins, at(kofgi = (list_of_relevant_values_of_kofgi_goes_here)
marginsplot, xline(`tp')

But again, you need to decide, based on your research questions, whether the adjusted or unadjusted model is the appropriate one. Then use those results and the corresponding graphs.

Ngozi ADELEYE

Join Date: Apr 2014
Posts: 80

11 Oct 2021, 15:23

Thanks a lot, Prof. Schechter
I actually used the same code for a previous project and I got the exact plot.
See it here;

Code:

xtpcse lnco2pc pc pcsq lnenu lndcb popg lnfdini lntr k2 y2-y37, hetonly
di -_b[pc]/(2 * _b[pcsq])
local tp = -_b[pc]/(2 * _b[pcsq])
twoway qfit lnco2pc pc, xli(`tp')

and the output:

Code:

xtpcse lnco2pc pc pcsq lnenu lndcb popg lnfdini lntr k2 y2-y37, hetonly

Number of gaps in sample:  33
note: y36 omitted because of collinearity
note: y37 omitted because of collinearity

Linear regression, heteroskedastic panels corrected standard errors

Group variable:   c_id                          Number of obs     =        508
Time variable:    year                          Number of groups  =         19
Panels:           heteroskedastic (unbalanced)  Obs per group:
Autocorrelation:  no autocorrelation                          min =          8
                                                              avg =  26.736842
                                                              max =         35
Estimated covariances      =        19          R-squared         =     0.8659
Estimated autocorrelations =         0          Wald chi2(42)     =    4214.72
Estimated coefficients     =        43          Prob > chi2       =     0.0000

------------------------------------------------------------------------------
             |            Het-corrected
     lnco2pc |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          pc |   .0005865   .0000398    14.73   0.000     .0005085    .0006646
        pcsq |  -3.19e-08   2.88e-09   -11.08   0.000    -3.76e-08   -2.63e-08
       lnenu |   .4366533   .0678342     6.44   0.000     .3037008    .5696058
       lndcb |   .3442699   .0415652     8.28   0.000     .2628036    .4257363
        popg |  -.0822604   .0384713    -2.14   0.032    -.1576627   -.0068581
     lnfdini |   .0484114   .0157768     3.07   0.002     .0174894    .0793334
        lntr |  -.1261522    .056504    -2.23   0.026     -.236898   -.0154064
          k2 |  -.0288462    .061343    -0.47   0.638    -.1490763    .0913838
          y2 |   .0617816   .1782102     0.35   0.729    -.2875039    .4110671
          y3 |   .1937459    .175764     1.10   0.270    -.1507452    .5382369
          y4 |   .0635031   .1778505     0.36   0.721    -.2850775    .4120837
          y5 |   .1245039   .1829247     0.68   0.496     -.234022    .4830297
          y6 |   .2153634    .185913     1.16   0.247    -.1490193    .5797462
          y7 |   .0394497   .1933122     0.20   0.838    -.3394353    .4183347
         y33 |  -.4327306   .1913891    -2.26   0.024    -.8078465   -.0576148
         y34 |  -.3997159   .1883905    -2.12   0.034    -.7689544   -.0304774
         y35 |  -.3514974   .1903157    -1.85   0.065    -.7245094    .0215146
         y36 |          0  (omitted)
         y37 |          0  (omitted)
       _cons |  -5.327906   .5717428    -9.32   0.000    -6.448501   -4.207311
------------------------------------------------------------------------------

. di -_b[pc]/(2 * _b[pcsq])
9188.5859

. local tp = -_b[pc]/(2 * _b[pcsq])

. twoway qfit lnco2pc pc, xli(`tp')

So, I'm baffled why it isn't producing a similar plot this time around.

Also, I tried your code and got this error message:

Code:

xtpcse lnco2 lnpc popgr renew rq c.kofgi##c.kofgi, rhotype(freg) np1

Linear regression, correlated panels corrected standard errors (PCSEs)

Group variable:   c_id                          Number of obs     =        140
Time variable:    year                          Number of groups  =          7
Panels:           correlated (balanced)         Obs per group:
Autocorrelation:  no autocorrelation                          min =         20
                                                              avg =         20
                                                              max =         20
Estimated covariances      =        28          R-squared         =     0.7736
Estimated autocorrelations =         0          Wald chi2(6)      =    1891.41
Estimated coefficients     =         7          Prob > chi2       =     0.0000

---------------------------------------------------------------------------------
                |           Panel-corrected
          lnco2 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
           lnpc |   .6572758   .0212834    30.88   0.000      .615561    .6989906
          popgr |   .2181154   .0488538     4.46   0.000     .1223638     .313867
          renew |  -.0068059   .0013833    -4.92   0.000    -.0095171   -.0040948
             rq |  -.2882291   .0664838    -4.34   0.000    -.4185348   -.1579233
          kofgi |   -.116146   .0327807    -3.54   0.000    -.1803949    -.051897
                |
c.kofgi#c.kofgi |   .0014956    .000385     3.88   0.000      .000741    .0022503
                |
          _cons |  -3.270157   .6360433    -5.14   0.000    -4.516779   -2.023535
---------------------------------------------------------------------------------

. local tp = -_b[kofgi]/(2 * _b[kofgi2])
[kofgi2] not found
r(111);

For the relevant values, do you imply the minimum and maximum values?

Code:

margins, at(kofgi = (list_of_relevant_values_of_kofgi_goes_here)

How will I input them in the syntax?
Thanks for the guide, appreciated.
Ngozi

Clyde Schechter

Join Date: Apr 2014

Posts: 30188
#4

11 Oct 2021, 17:30

What you show in #3 is a different analysis using different variables. When you take a simple model, regressing Y on X and X^2 and add additional variables, anything can happen. The results may change drastically, or not change much at all. It depends on how correlated the added variables are with X and Y. In the example you report in #3, apparently things didn't change much. You were "lucky," in the sense that you were not surprised. But your surprise in what you show in #1 is due entirely to your unwarranted expectation that you would be "lucky" again in these analyses.

For the relevant values, do you imply the minimum and maximum values?
Code:

margins, at(kofgi = (list_of_relevant_values_of_kofgi_goes_here)
How will I input them in the syntax?

The minimum and maximum values might or might not be relevant. They would, in any case, not be sufficient. By relevant I mean values of kofgi that are normally encountered in real world situations. The minimum and maximum in your sample might fit that description--or they might be unusual outliers that people might not care much about. This judgment is not a statistical one--it is a pragmatic one based on knowledge of these variables' real-world meaning and implications. So I can't help you pick them. If you are not confident of your own judgment in the subject matter area, consult a colleague who has experience. With that said, you want a large enough number of values to reasonably fill in the graph. If you used just a low and high value, then the graph -marginsplot- draws would just be a straight line connecting those points.

To illustrate the syntax, suppose that a relevant, interesting range of values is from 5,000 to 15,000. Then the syntax in -margins- would look like this:

Code:

margins, at(kofgi = (5000(1000)15000))

That way -margins- will calculate the expected value of lnco2 at each value of kofgi from 5000 through 15,000 in increments of 1000. That will provide 11 different values of kofgi, which is enough to convey the shape of the curve. By the way, note the parentheses in this code--the code I showed in #2 was incorrect and had unbalanced parentheses.
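
A minimal sketch of the same -margins- numlist syntax, assuming the auto dataset that ships with Stata and -regress- in place of -xtpcse-; the variables and the range 2000 to 4500 are illustrative only.

```stata
* Sketch on the built-in auto data (illustrative variables, not the thread's)
sysuse auto, clear
regress price c.weight##c.weight mpg

* 2000(500)4500 expands to 2000 2500 3000 3500 4000 4500: six evaluation points
margins, at(weight = (2000(500)4500))

* Plot the predicted values against weight
marginsplot
```

A smaller increment in the numlist gives a smoother curve at the cost of a longer -margins- table.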

I notice also that your calculation of the turning point failed in the final regression in #3. That's because you tried to reference _b[kofgi2], but in your factor-variable model there is no such variable. You have to use the factor-variable notation in the turning-point expression as well. So that should be:

Code:

local tp = -_b[kofgi]/(2*_b[c.kofgi#c.kofgi])
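
If a standard error for the turning point is also wanted, the same expression can be passed to -nlcom-, which applies the delta method. This is a sketch on the auto dataset that ships with Stata, with -regress- standing in for -xtpcse-; the variables and the label tp are illustrative.

```stata
* Sketch on the built-in auto data (illustrative variables, not the thread's)
sysuse auto, clear
regress price c.weight##c.weight mpg

* Point estimate of the vertex, using factor-variable coefficient names
local tp = -_b[weight]/(2*_b[c.weight#c.weight])
display "turning point: " `tp'

* Same expression with a delta-method standard error and confidence interval
nlcom (tp: -_b[weight]/(2*_b[c.weight#c.weight]))
```

The -nlcom- point estimate matches the local macro; the added value is the confidence interval, which shows how precisely the turning point is located.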

Ngozi ADELEYE

Join Date: Apr 2014
Posts: 80

11 Oct 2021, 21:06

Dear Prof. Schechter,

I have come to say THANK YOU!!!

These codes produced EXACTLY what I needed:

Code:

xtpcse lnco2 lnpc popgr renew rq c.enu##c.enu, rhotype(freg) np1
local tp = -_b[enu]/(2 * _b[c.enu#c.enu])
margins, at(enu = (100(50)1000))
marginsplot, xline(`tp')

and:

Code:

xtpcse lnco2 lnpc popgr renew rq c.kofgi##c.kofgi, rhotype(freg) np1
local tp = -_b[kofgi]/(2*_b[c.kofgi#c.kofgi])
margins, at(kofgi = (10(10)100))
marginsplot, xline(`tp')

May you LIVE long, Prof!
Gracias, Ngozi

Justine Borja

Join Date: Oct 2021

Posts: 27
#6

04 Nov 2021, 10:45

Hello,

I am doing my bachelor's thesis and I have encountered the same problem in finding the turning point of my EKC. I employed trade openness (TO) and trade openness squared (TO2) as my main explanatory variables against CO2 emissions as the measure of environmental quality. My results showed that the coefficient on TO is negative while the coefficient on TO2 is positive. As I have learned from the literature, those signs imply that my model portrays a U-shaped curve. This is where the difficulty enters: I want to generate a U-shaped curve in Stata and I do not know the appropriate commands to use. I've tried the commands suggested in this thread, but they show an inverted U-shaped curve.

I would appreciate all of your insights/responses regarding this. Thanks in advance!

Regards,
Justine

Clyde Schechter

Join Date: Apr 2014

Posts: 30188
#7

04 Nov 2021, 11:12

It is unlikely that anybody will be able to help you without seeing the actual commands you tried and the results you got from Stata. Example data, using the -dataex- command would likely be needed as well. Please post back with those. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.

Nick Cox

Join Date: Mar 2014

Posts: 35806
#8

04 Nov 2021, 11:19

Justine Borja You are running the same question in two threads. I will post in the other one suggesting arbitrarily that people come here. But in general posting the same question several times doesn't multiply your chances of a good answer, just the scope for a fragmented and repetitive response. One thread at a time please.

Last edited by Nick Cox; 04 Nov 2021, 11:21.

Justine Borja

Join Date: Oct 2021
Posts: 27

05 Nov 2021, 10:44

Hello Sir Clyde Schechter. I am testing the validity of the Environmental Kuznets Curve in my paper using the ARDL model. I treated trade openness and trade openness squared as my main variables in relation to CO2 emissions. I am using time-series data. My results are as follows:

Code:

  ardl lnco2cb_pc lnto lnto2 lnrgdp_pc lnenergy, lags(1 2 2 1 0) ec

These are the short-run and long-run parameters:

Code:

ARDL(1,2,2,1,0) regression

Sample:     1992 -     2019                     Number of obs     =         28
                                                R-squared         =     0.8664
                                                Adj R-squared     =     0.7879
Log likelihood =  64.159115                     Root MSE          =     0.0314

------------------------------------------------------------------------------
D.lnco2cb_pc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ADJ          |
  lnco2cb_pc |
         L1. |  -1.033295   .1702565    -6.07   0.000    -1.392505   -.6740853
-------------+----------------------------------------------------------------
LR           |
        lnto |  -.1261742   .0676143    -1.87   0.079    -.2688279    .0164794
       lnto2 |   .0730523   .0372009     1.96   0.066    -.0054347    .1515393
   lnrgdp_pc |   .3476238   .0481941     7.21   0.000     .2459431    .4493045
    lnenergy |   .5848107   .1429328     4.09   0.001     .2832488    .8863725
-------------+----------------------------------------------------------------
SR           |
        lnto |
         D1. |   .2737837   .1169716     2.34   0.032     .0269952    .5205722
         LD. |   .2597734   .1301159     2.00   0.062    -.0147471    .5342939
             |
       lnto2 |
         D1. |   .3707275   .1138614     3.26   0.005     .1305009     .610954
         LD. |   .3233343   .1237917     2.61   0.018     .0621566     .584512
             |
   lnrgdp_pc |
         D1. |   .6200075   .3585193     1.73   0.102    -.1364022    1.376417
             |
       _cons |  -3.058857   .9130884    -3.35   0.004    -4.985305   -1.132409
------------------------------------------------------------------------------

Note that I am only interested in the coefficients of the long-run parameters, lnto and lnto2. The signs suggest that there exists a U-shaped curve. I have calculated the turning point and found it to be at .86358814 (min = -1.695541, max = .0979904). I want to visually represent this U-shaped curve to solidify my case.

Thanks,
Justine

Last edited by Justine Borja; 05 Nov 2021, 10:48.

Clyde Schechter

Join Date: Apr 2014

Posts: 30188
#10

05 Nov 2021, 10:55

-ardl- is a user-written command that I am unfamiliar with. From its description at SSC I see it was written for version 11.2. I do not know if it supports factor variable notation. If it does, then your first step is to re-do the regression properly with factor variable notation:

Code:

ardl lnco2cb_pc c.lnto##c.lnto lnrgdp_pc lnenergy, lags(1 2 2 1 0) ec
margins, at(lnto = some_list_of_realistic_values_of_lnto)
marginsplot

The list of realistic values of lnto should consist of about a dozen values that more or less equally span the range of values of lnto that are commonly observed in real life. If you want to highlight the turning point, then that list should include the value 0.86358814. (If 0.86358814 is not a realistic value for lnto, then your model has suggested that you do not in fact have a quadratic relationship, but rather a somewhat curvilinear relationship, but not one that turns around in real life.)
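
Whether the turning point falls inside the observed range can be checked with a few lines of arithmetic. The sketch below uses only the long-run coefficients and the minimum and maximum reported in #9; the scalar names are illustrative.

```stata
* Long-run coefficients and sample range as reported in #9
scalar b_lnto   = -.1261742
scalar b_lnto2  =  .0730523
scalar lnto_min = -1.695541
scalar lnto_max =  .0979904

* The vertex of b1*x + b2*x^2 is at x = -b1/(2*b2)
scalar tp = -b_lnto/(2*b_lnto2)
display "turning point: " tp

* inrange() is 1 when lnto_min <= tp <= lnto_max
display cond(inrange(tp, lnto_min, lnto_max), "inside", "outside") " the observed range"
```

With these numbers the turning point is about 0.864, which lies above the reported maximum of about 0.098, so the fitted curve does not turn around within the sample.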

If -ardl- does not support factor-variable notation or is not supported by -margins-, you would have to write code that, in effect, emulates what -margins- would do to calculate expected values conditional on the values of lnto. As I don't know what -ardl- actually does, I can't help you with that.

Justine Borja

Join Date: Oct 2021

Posts: 27
#11

05 Nov 2021, 21:23

Unfortunately, the ARDL does not allow for factor variable notation in estimation. Do you know any other way to visually represent a U-shaped curve in the time-series? Thanks!

Clyde Schechter

Join Date: Apr 2014

Posts: 30188
#12

06 Nov 2021, 10:36

Does the -predict- command run after ardl? If so, I can show you how to wrap that in some code that will get you your quadratic curve.

If not, somebody who knows what -ardl- does and how to calculate predicted values from its results will have to respond. There are probably Forum members who can do that, but they may not be following this thread. If none of them chimes in here within about 24 hours, start a new thread, and make sure to mention ARDL in the thread title.

Justine Borja

Join Date: Oct 2021

Posts: 27
#13

07 Nov 2021, 01:01

Good day. I have tried the predict command after ARDL, but I don't know if this is the -predict- command you are referring to.

Code:

ardl lnco2cb_pc lnto lnto2 lnrgdp_pc lnenergy, lags(1 2 2 1 0) ec

Code:

predict lnco2cb_pchat

Is that right, Sir Clyde Schechter?

Thanks,
Justine

Clyde Schechter

Join Date: Apr 2014

Posts: 30188
#14

07 Nov 2021, 08:58

Yes, that's the one.

So, something like this. This code is not tested, so you may encounter errors in it, but this is the gist of the approach.

Code:

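* Values of lnto at which to evaluate the model
local values -1.7(0.1)0.1

* Separate frame to collect one (lnto, expected value) pair per iteration
frame create margin_calculations float(lnto expected_value)

* Keep copies of the observed data so they can be restored afterwards
clonevar lnto_original = lnto
clonevar lnto2_original = lnto2

foreach v of numlist `values' {
    * Set every observation to the same lnto and keep the square in sync
    replace lnto = `v'
    replace lnto2 = `v'*`v'
    predict pchat
    summ pchat, meanonly
    frame post margin_calculations (`v') (r(mean))
    drop pchat
}

* Restore the observed data
replace lnto = lnto_original
replace lnto2 = lnto2_original

* Switch to the results frame and plot the curve
frame change margin_calculations
sort lnto
graph twoway connect expected_value lnto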

You can modify the numbers in local values to reflect what you think are the realistic and interesting range of values of lnto. I rounded off the range you mentioned in #9 to get these, but you might prefer some other set. You can modify the graph itself to suit your preferences using all of the options available with -graph twoway-.
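
The same replace-predict-average loop can be tried on data that ships with Stata before it is applied to the -ardl- results. The sketch below is an assumption-laden stand-in: it uses -regress- on the auto dataset with a hand-made squared term, and the names weight2, weight_orig and xbhat are hypothetical. It prints the averaged predictions instead of posting them to a frame.

```stata
* Sketch on the built-in auto data (illustrative variables, not the thread's)
sysuse auto, clear
generate double weight2 = weight^2
regress price weight weight2 mpg

* Keep a copy of the observed values so they can be restored
clonevar weight_orig = weight

foreach v of numlist 2000(500)4500 {
    * Set every observation to the same weight and keep the square in sync
    quietly replace weight = `v'
    quietly replace weight2 = `v'^2
    quietly predict double xbhat, xb
    quietly summarize xbhat, meanonly
    * Average prediction at this weight, other covariates as observed
    display %6.0f `v' "  " %9.2f r(mean)
    drop xbhat
}

* Restore the observed data
quietly replace weight = weight_orig
quietly replace weight2 = weight^2
drop weight_orig
```

Because the squared term is a separate variable here, it has to be updated by hand inside the loop, which is the bookkeeping that factor-variable notation and -margins- would otherwise take care of.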

Justine Borja

Join Date: Oct 2021

Posts: 27
#15

07 Nov 2021, 12:01

Your help is deeply appreciated! However, I have realized that a negative number is not appropriate for trade openness, because there is no such thing as a negative index; the lowest value is 0. Or do I need to consider that the variable is log-transformed, and that is why negative numbers are present?