  • Spline Regressions

    I am currently working on something similar to what Hausmann, Pritchett and Rodrik did in their famous paper: Hausmann, Ricardo, Lant Pritchett, and Dani Rodrik. "Growth accelerations." I am having trouble working with spline regressions, since I have had no exposure to them so far. I am using the same criteria for identifying growth accelerations as they did. They looked for rapid growth episodes that satisfy the following conditions.

    (1) \(g_{t,t+n} \ge 3.5\) ppa, growth is rapid,

    (2) \(g_{t,n} \ge 2.0\) ppa, growth accelerates,

    (3) \(y_{t+n} \ge \max\{y_i\},\ i \le t\), post-growth output exceeds pre-episode peak.

    According to Hausmann et al., "since for some countries there are a number of consecutive years for which these criteria of a growth episode are met, the “best” starting date is chosen by looking for the best fit among all contiguous eligible dates." The timing of the initiation of the growth acceleration is chosen by finding the year that maximizes the F-statistic of a spline regression with a break at the relevant year. I have read a lot about mkspline, but one problem is that I want Stata to choose the knots instead of assigning them myself.

    I am sorry if the question is too broad. I am struggling to figure out how to code this in Stata, and any help would be much appreciated.

  • #2
    This is an interdisciplinary field, so it is best to assume that famous papers in your particular sub-sub-sub-discipline are completely unknown to the people on this list. With this in mind a lot of your question becomes unreadable. For example, since you did not define the symbols, the conditions (1), (2), and (3) are meaningless to us.

    As to your question: From the quote it seems that Hausmann et al. chose the knots themselves, i.e. they tried several models with different starting years and chose the one with the largest F-statistic. It is possible to trick nl into finding "optimal" knot positions, but from the quote it does not look like the study you want to replicate did that.
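The "try several starting years and keep the largest F-statistic" procedure described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated, the variable names (year, lny) and the candidate window 1980 to 1990 are hypothetical, and it is not claimed to reproduce the exact specification in the paper.

```stata
* Hypothetical example: simulated log-output series with a trend break in 1985
clear
set seed 12345
set obs 40
gen year = 1959 + _n
gen lny = 0.01*(year - 1960) + 0.04*max(0, year - 1985) + rnormal(0, 0.02)

* Loop over the candidate break years (assumed here to be the eligible dates)
local bestF = -1
local bestyear = .
forvalues k = 1980/1990 {
    capture drop pre post
    mkspline pre `k' post = year      // linear spline with a single knot at `k'
    quietly regress lny pre post
    if e(F) > `bestF' {               // keep the break year with the largest F
        local bestF = e(F)
        local bestyear = `k'
    }
}
display "Best break year: `bestyear'  (F = " %9.2f `bestF' ")"
```

In real data the loop would run separately for each country, over only those years that satisfy the eligibility criteria.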
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------


    • #3
      My regression equation studies the impact of different variables on households’ daily electricity consumption: \(\text{Cons}_{i,d} = a_i + \dots + M W_d + e_{i,d}\)
      where
      \(i\): individual household
      \(d\): day
      \(M\): vector of coefficients on \(W_d\)
      \(W_d\): spline function controlling for temperature-driven shifts in electricity demand

      In particular, to control for temperature-driven shifts in electricity demand, the regression equation includes the average daily temperature in a certain region (\(T_d\)). The daily temperature must enter the model in piecewise linear form with three knot points (at 63F, 70F, and 75F). Specifically, \(W_d\) represents the following 5 × 1 vector:
      [Image: definition of the vector \(W_d\) (Screen Shot 2019-01-12 at 4.45.21 PM.png)]

      I need to code this in Stata. My questions are not about the regression but about the ‘mkspline’ command.

      mkspline T1 63 T2 70 T3 75 T4 = T

      1. The first row of the vector \(W_d\) is 1. What does this mean? Do I need to change my code somehow to include this 1?
      2. I do not fully understand the ‘marginal’ option of the ‘mkspline’ command. Do I need to include this option in my code?

      Thank you.


      • #4
        Your model is not identified: it contains a constant in \(W_d\) and, implicitly, another one in your main regression equation, and a constant is obviously perfectly correlated with itself. I suspect you copied the formula for \(W_d\) from another text where \(W_d\) is the entire design matrix, for example \(y = M W_d + \varepsilon\). There the regression equation has no separate constant, so the constant needs to be in the design matrix. This differs from your model. So, the way you set up the model, \(W_d\) should not contain the vector of 1s, and this is also how Stata works: by default it includes a constant without you having to specify it. So mkspline works correctly, and you do not have to change anything.
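As a minimal sketch of the point above, the mkspline call from the question can be fed straight into a regression without any column of 1s. The data here are simulated and the variable names (T, cons) are hypothetical; the household effects from the original model are left out to keep the example short.

```stata
* Hypothetical data: daily temperature T (degrees F) and consumption cons
clear
set seed 2019
set obs 500
gen T = 40 + 50*runiform()
gen cons = 30 - 0.1*T + 0.5*max(0, T - 70) + rnormal()

* Piecewise-linear temperature terms with knots at 63F, 70F and 75F
mkspline T1 63 T2 70 T3 75 T4 = T

* regress adds the constant by itself, so no vector of 1s is created
regress cons T1 T2 T3 T4
```

With the default parameterisation the four new variables add up to T, and each coefficient is the slope of consumption with respect to temperature within its own interval.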

        As to the marginal option, it is just a different representation of the same model. You can choose whichever you find most convenient. You just need to make sure you interpret the model correctly.
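To illustrate that the two representations describe the same model, the sketch below fits both on simulated data (hypothetical variables T and cons) and compares the fitted values. Only the interpretation of the coefficients differs: without marginal they are the slopes in each interval, with marginal they are the changes in slope relative to the preceding interval.

```stata
* Hypothetical data: daily temperature T (degrees F) and consumption cons
clear
set seed 2019
set obs 500
gen T = 40 + 50*runiform()
gen cons = 30 - 0.1*T + 0.5*max(0, T - 70) + rnormal()

* Default: each coefficient is the slope within its own interval
mkspline T1 63 T2 70 T3 75 T4 = T
quietly regress cons T1 T2 T3 T4
predict double fit_default, xb

* marginal: the first coefficient is the slope below 63F, the others are
* the changes in slope at each knot
mkspline M1 63 M2 70 M3 75 M4 = T, marginal
quietly regress cons M1 M2 M3 M4
predict double fit_marginal, xb

* Both parameterisations should give the same fitted values (up to rounding)
gen double diff = abs(fit_default - fit_marginal)
summarize diff
```

The marginal form can be convenient when the question is whether the slope changes at a knot, since that is then a test on a single coefficient.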
        ---------------------------------
        Maarten L. Buis
        University of Konstanz
        Department of history and sociology
        box 40
        78457 Konstanz
        Germany
        http://www.maartenbuis.nl
        ---------------------------------


        • #5
          Maarten, thank you!
