  • Issue using GDP and GDP squared in regression analysis

    *** I accidentally uploaded this post twice due to an issue with my computer. Apologies; here is the other thread:

    https://www.statalist.org/forums/for...ssion-analysis

    I am currently completing a study on the EKC (environmental Kuznets curve) hypothesis. I am using a time-series data set and have taken logs of all the variables. For one part of the study I need to calculate the turning point, and to do this I need the estimated coefficients on GDP and GDP squared. Every time I use the regress command with my variables in Stata, it omits GDP due to collinearity, so I cannot calculate the turning point. Is there a way around this?

    If I have not given enough information, please let me know. Thank you in advance for any help.

    Last edited by Sabrina Deegan; 06 Jul 2023, 08:02.

  • #2
    How did you generate the squared term?

    Try c.lgdp#c.lgdp


    • #3
      Really large numbers (like GDP) and their squares are likely to be very hard to distinguish from a linear function of GDP, and one of the terms is thus removed from the model. The easiest solution is to first center the GDP variable (e.g. subtract the mean) and probably rescale it as well (e.g. divide by 1,000,000,000 if GDP is measured in dollars). After that you can use the new variable and its square, find the optimum, and reverse the rescaling and centering to get the optimum in the original unit.
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------


      • #4
        Maarten offers a useful suggestion.

        Also, ln(x) and ln(x^2) are collinear. ln(x) and (ln(x))^2 are not, but are often quite close to being so.
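
        The difference can be seen in a minimal sketch. It assumes the auto data set that ships with Stata, and the variable names lnprice, ln_price_sq and lnprice_sq are made up for the illustration; none of this is the original poster's data.

        ```stata
        sysuse auto, clear                 // example data shipped with Stata (assumption)

        gen lnprice     = ln(price)
        gen ln_price_sq = ln(price^2)      // log of the square: equals 2*ln(price)
        gen lnprice_sq  = lnprice^2        // square of the log: (ln(price))^2

        // log of the square: Stata should omit one of the two terms
        regress mpg lnprice ln_price_sq

        // square of the log: both coefficients can be estimated
        regress mpg lnprice lnprice_sq

        // the two terms are still typically very highly correlated
        correlate lnprice lnprice_sq
        ```

        In the first regression one regressor is an exact multiple of the other, so there is nothing to separate them; in the second the relation is only approximately linear, which is why centering is still worth considering.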


        • #5
          I bet George is correct: You used log(x^2) rather than (log(x))^2.


          • #6
            To add to George's and Jeff's comments: Remember that \(\log(c^d) = d \times \log(c)\). A logarithm \(\log_a(b)\) asks to what power do I need to raise \(a\) to get \(b\). So \(\log_2(8)=3\) because \(2^3=8\). We can write \(8\) as \(2^3\), so we can write \(8^2\) as \((2^3)^2\). Writing that out gets \(\underbrace{2 \times 2 \times 2}_{2^3} \times \underbrace{2 \times 2 \times 2}_{2^3} = 2^6\) or \(2^{3\times 2}\). So \(\log_2(8^2) = 2\times \log_2(8) = 2\times 3 = 6\)

            So \(\ln(gdp^2) = 2 \ln(gdp)\), and \(2 \ln(gdp)\) and \(\ln(gdp)\) are just a linear combination of one another and thus perfectly collinear. So if you want to use both the logarithm and the square, you need to first take the logarithm and then the square. So, putting all the advice you have received thus far together, your .do file should look something like this:

            Code:
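            gen lngdp = ln(gdp)                // take the log first
            sum lngdp
            local m = r(mean)                  // store the mean of log GDP
            gen lngdpc = lngdp - `m'           // center log GDP at its mean
            
            // ## enters both the linear term and its square; x1 x2 stand for the controls
            reg y c.lngdpc##c.lngdpc x1 x2
            
            // add the mean back and exponentiate to get the turning point in units of gdp
            di "turning point: " exp(-_b[lngdpc]/(2*_b[c.lngdpc#c.lngdpc]) + `m')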
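
            The expression in the last line follows from setting the derivative of the quadratic to zero. A short sketch of that step, where \(z\), \(m\), \(\beta_1\) and \(\beta_2\) are notation introduced here for the centered log of GDP, its mean, and the coefficients on the linear and squared terms:

            ```latex
            \[
            \begin{aligned}
            E[y] &= \beta_0 + \beta_1 z + \beta_2 z^2 + \dots, \qquad z = \ln(gdp) - m \\
            \frac{\partial E[y]}{\partial z} &= \beta_1 + 2\beta_2 z = 0
              \;\Longrightarrow\; z^{*} = -\frac{\beta_1}{2\beta_2} \\
            \ln(gdp^{*}) &= z^{*} + m
              \;\Longrightarrow\; gdp^{*} = \exp\!\left(-\frac{\beta_1}{2\beta_2} + m\right)
            \end{aligned}
            \]
            ```

            The turning point is a maximum when \(\beta_2 < 0\) (the inverted U that the EKC hypothesis predicts) and a minimum when \(\beta_2 > 0\).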
            Last edited by Maarten Buis; 08 Jul 2023, 03:55.


            • #7
              I'd be thinking about using semipar as well.


              • #8
                Hi All,

                Thank you so much for all your replies! I did not see this until today, and I will have access to Stata tomorrow to apply the comments you all left. I did find a way to solve my issue: before creating the log of GDP per capita, I centered it, then squared it, and then took the log. This fixed the collinearity issue. I am not sure if I did this in the correct order (or correctly at all). I had previously created the square in Excel.

                This allowed me to run a regression and calculate a number for the turning point, although I am unsure what the numbers mean.
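                      Source |          SS    df         MS      Number of obs =      45
                -------------+--------------------------------   F(5, 39)      =   132.1
                       Model | 0.406994255     5   0.081399      Prob > F      =       0
                    Residual | 0.024031031    39   0.000616      R-squared     =  0.9442
                -------------+--------------------------------   Adj R-squared =  0.9371
                       Total | 0.431025286    44   0.009796      Root MSE      = 0.02482

                ------------------------------------------------------------------------------
                      ltotal |      Coef.   Std. Err.       t     P>t    [95% Conf. Interval]
                -------------+----------------------------------------------------------------
                        lgdp | -0.0641931   0.007803    -8.23       0    -0.07998   -0.048411
                      ltrade | -0.1174045    0.05161    -2.27   0.028     -0.2218  -0.0130139
                        lfdi |  0.0070169   0.008733      0.8   0.427    -0.01065   0.0246819
                     lenergy |  0.6849436   0.089249     7.67       0    0.504421   0.8654664
                 lgdpsquared |  0.0089383   0.003061     2.92   0.006    0.002746   0.0151302
                       _cons |   7.234039   0.877666     8.24       0    5.458792    9.009285
                ------------------------------------------------------------------------------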

                Here were my results. When I used the formula x = -y1/y2, where x is the turning point, y1 is the coefficient of GDP and y2 is the coefficient of GDP squared, I got 7.181802.

                I am not sure what to do with this number or whether I have done it correctly, as the turning point should obviously be much higher. Apologies if these seem like stupid questions. I am struggling to wrap my head round this bit.


                • #9
                  The formula for the turning point, using your notation, is -y1/(2*y2).
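
                  As a rough check, the coefficients reported in #8 can be plugged into both versions of the formula. This is only arithmetic on the posted numbers; what the result means depends on how lgdp and lgdpsquared were actually constructed.

                  ```stata
                  // formula used in #8, without the factor 2: about 7.18
                  display -(-0.0641931)/0.0089383

                  // formula with the factor 2: about 3.59
                  display -(-0.0641931)/(2*0.0089383)
                  ```

                  The result is on the scale of the lgdp regressor, so any centering, rescaling or log transformation still has to be reversed to get a value in the original units of GDP. Note also that the coefficient on the squared term is positive here, which makes this turning point a minimum rather than a maximum.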
