  • Spline Regressions

    I am working on something similar to what Hausmann, Pritchett and Rodrik did in their well-known paper: Hausmann, Ricardo, Lant Pritchett, and Dani Rodrik. "Growth accelerations." I am having trouble with spline regressions, since I have had no exposure to them so far. I am using the same criteria for identifying growth accelerations as they did. They looked for rapid growth episodes that satisfy the following conditions.

    (1) \(g_{t,t+n} \ge 3.5\) ppa, growth is rapid,

    (2) \(g_{t,n} \ge 2.0\) ppa, growth accelerates,

    (3) \(y_{t+n} \ge \max\{y_i\}\), \(i \le t\), post-growth output exceeds pre-episode peak.

    According to Hausmann et al., "since for some countries there are a number of consecutive years for which these criteria of a growth episode are met, the “best” starting date is chosen by looking for the best fit among all contiguous eligible dates." The timing of the initiation of the growth acceleration is chosen by finding the year that maximizes the F-statistic of a spline regression with a break at the relevant year. I have read a lot about mkspline, but one of my problems is that I want Stata to choose the knots instead of assigning them myself.

    I am sorry if the question is too broad. I am struggling to figure out how to code this in Stata, and any help would be much appreciated.

  • #2
    This is an interdisciplinary field, so it is best to assume that famous papers in your particular sub-sub-sub-discipline are completely unknown to the people on this list. With this in mind a lot of your question becomes unreadable. For example, since you did not define the symbols, the conditions (1), (2), and (3) are meaningless to us.

    As to your question: From the quote it seems that Hausmann et al. chose the knots themselves, i.e. they tried several models with different starting years and chose the one with the largest F-statistic. It is possible to trick nl into finding "optimal" knot positions, but from the quote it does not look like the study you want to replicate did that.
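
    A minimal sketch of that kind of search, for a single country, might look like the code below. The variable names lngdp (log GDP per capita) and year, and the candidate range 1965 to 1972, are hypothetical placeholders. The sketch also uses whatever observations are in memory, whereas the paper may restrict the estimation sample around each candidate year.

    ```stata
    * Sketch only: lngdp, year, and the range 1965/1972 are hypothetical
    local bestF = 0
    local bestyear = .
    forvalues b = 1965/1972 {
        * linear spline in year with a single knot (break) at year `b'
        mkspline pre `b' post = year
        regress lngdp pre post
        * missing values compare as larger than any number, so exclude them
        if !missing(e(F)) & e(F) > `bestF' {
            local bestF = e(F)
            local bestyear = `b'
        }
        drop pre post
    }
    display "break year with the largest F: `bestyear' (F = `bestF')"
    ```

    Here you still supply the candidate knots yourself; the loop only picks the one that fits best.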
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Thank you for your reply. I apologize that I wasn't specific enough in my question. \(y_t\) is GDP per capita and \(g_{t,t+n}\) is the least squares growth rate of GDP per capita. Conditions (1), (2), and (3) are the criteria that economic growth must meet to qualify as an acceleration.
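
      For reference, a least squares growth rate over the window from \(t\) to \(t+n\) is usually taken to be the slope of a regression of log GDP per capita on time. The following is a sketch under that assumption; the intercept \(a\), the index \(i\), and the error term \(u\) are notation introduced here for illustration:

      \[
      \ln y_{t+i} = a + g_{t,t+n}\,(t+i) + u_{t+i}, \qquad i = 0, \dots, n
      \]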

      You wrote "i.e. they tried several models with different starting years and chose the one with the largest F-statistic". If you could elaborate on this, I would appreciate it. This is the part that is confusing me. I have never worked with regression splines before, so I am not sure what the authors are doing there. Any insight would be helpful.

      I apologize if I'm not specific enough again. I will copy the paragraph from the paper, hoping it might give more insight into what I am trying to replicate.

      "We set the relevant time horizon to be eight years (i.e., n= 7). The timing of the initiation of the growth acceleration is chosen by finding the year that maximizes the F -statistic of a spline regression with a break at the relevant year. That is, since for some countries there are a number of consecutive years for which these criteria of a growth episode are met, the “best” starting date is chosen by looking for the best fit among all contiguous eligible dates. Countries can have more than one instance of growth acceleration as long as the dates are more than 5 years apart (so a country could accelerate from 0 to 3.5 percent in 1967 and then accelerate from 3.5 to 6.0 percent in 1972 as two distinct episodes)."



      • #4
        My regression equation studies the impact of different variables on households’ daily electricity consumption: \(Cons_{i,d} = a_i + \dots + M W_d + e_{i,d}\)
        where
        i: individual household
        d: day
        M: vector of coefficients on \(W_d\)
        \(W_d\): spline function controlling for temperature-driven shifts in electricity demand

        In particular, to control for temperature-driven shifts in electricity demand, the regression equation includes the average daily temperature in a certain region (\(T_d\)). The daily temperature must enter the model in piecewise linear form with three knot points (at 63F, 70F, and 75F). Specifically, \(W_d\) represents the following 5 x 1 vector:
        [Image: Screen Shot 2019-01-12 at 4.45.21 PM.png, showing the vector \(W_d\)]

        I need to code this in Stata. My questions are not about the regression but about the ‘mkspline’ command.

        mkspline T1 63 T2 70 T3 75 T4 = T

        1. The first row of the vector \(W_d\) is 1. What does this mean? Do I need to change my code somehow to include this 1?
        2. I do not fully understand the ‘marginal’ option of the ‘mkspline’ command. Do I need to include this option in my code?

        Thank you.



        • #5
          Your model is not identified as written: it contains a constant in \(W_d\) and, implicitly, also in your main regression equation, and a constant is obviously perfectly correlated with itself. I suspect you copied the formula for \(W_d\) from another text where \(W_d\) is the entire design matrix, for example \(y=W_d+\varepsilon\). There the regression equation has no separate constant, so the constant has to be in the design matrix. This differs from your model. So, the way you set up the model, \(W_d\) should not contain the vector of 1s. This is also how Stata works: by default it includes a constant without you having to specify it. So mkspline works correctly, and you do not have to change anything.
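
          For reference, with knots at 63, 70, and 75 and without the marginal option, mkspline creates four variables, none of which is a constant. Writing \(T_1\) to \(T_4\) for the variables T1 to T4 in your command:

          \[
          \begin{aligned}
          T_1 &= \min(T, 63) \\
          T_2 &= \max(\min(T, 70), 63) - 63 \\
          T_3 &= \max(\min(T, 75), 70) - 70 \\
          T_4 &= \max(T, 75) - 75
          \end{aligned}
          \]

          For a day with \(T = 80\), for example, this gives 63, 7, 5, and 5, which sum back to 80.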

          As to the marginal option, it is just a different representation of the same model. You can choose whichever you find most convenient. You just need to make sure you interpret the model correctly.
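
          A minimal sketch of the two representations is below. The outcome name cons is a hypothetical placeholder, T is the temperature variable from your command, and the household effects and other covariates of your model are left out to keep the example short. It assumes T1 to T4 and M1 to M4 do not already exist in the data.

          ```stata
          * Sketch only: cons is a hypothetical name for daily consumption

          * Default: each coefficient is the slope within its own segment
          mkspline T1 63 T2 70 T3 75 T4 = T
          regress cons T1 T2 T3 T4

          * marginal: M1 = T, M2 = max(0, T - 63), M3 = max(0, T - 70),
          * M4 = max(0, T - 75); the coefficient on M1 is the slope of the
          * first segment, and each later coefficient is the change in slope
          * relative to the preceding segment
          mkspline M1 63 M2 70 M3 75 M4 = T, marginal
          regress cons M1 M2 M3 M4
          ```

          Both regressions give the same fitted values, and in both regress adds the constant on its own.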
          ---------------------------------
          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz
          Germany
          http://www.maartenbuis.nl
          ---------------------------------



          • #6
            Maarten, thank you!
