nl hockey - Clustered Standard Errors

Simon Baldwin

Join Date: Jun 2022
Posts: 6

nl hockey - Clustered Standard Errors

15 Jun 2022, 12:06

Dear Statalist community,

I am using StataSE 16 (64-bit) on Windows 10. My question is regarding producing clustered standard errors when using nl hockey, which is a user-written command by Dr Mark Lunt for estimating the breakpoint between two intersecting straight lines (referred to as a piecewise [or "hockey stick"] regression). Please see the ado file:

HTML Code:

https://personalpages.manchester.ac.uk/staff/mark.lunt/nlhockey.ado

and the help file:

HTML Code:

https://personalpages.manchester.ac.uk/staff/mark.lunt/nlhockey.hlp

There is an UCLA tutorial on piecewise regressions in Stata available here:

HTML Code:

https://stats.oarc.ucla.edu/stata/faq/how-can-i-run-a-piecewise-regression-in-stata/

from which I'll be using a version of the talk dataset that I have adapted to this question. Please use the following code to reproduce this dataset:

Code:

. use https://stats.idre.ucla.edu/stat/stata/faq/talk, clear
. expand 2
. sort id
. by id: gen obs = _n
. replace talk = talk + .8474287 if obs == 2 & age <= 11.35291
. replace talk = talk + 3.974196 if obs == 2 & age > 11.35291
. replace age = age + 1 if obs == 2

The dataset contains 4 variables:

id: an anonymous identifier of a child in the dataset
talk: how much the child talks on the phone
age: the age of the child
obs: the repeated observation number

When running a piecewise regression using nl hockey to look at the relationship between how much a child talks on the phone and the age of the child, 4 parameters are estimated:

breakpoint: the breakpoint between two intersecting straight lines modelling the change in talk per year of age
slope_l: the gradient of the straight line to the left of the breakpoint
slope_r: the gradient of the straight line to the right of the breakpoint
cons: the constant (at age = 0)

The code and results are as follows:

Code:

. nl hockey talk age
(obs = 400)

Iteration 0:   residual SS =   1724947
Iteration 1:   residual SS =  48522.02
Iteration 2:   residual SS =  31301.01
Iteration 3:   residual SS =  28998.47
Iteration 4:   residual SS =  28917.09
Iteration 5:   residual SS =  28859.84
Iteration 6:   residual SS =  28763.72
Iteration 7:   residual SS =  28763.47

      Source |       SS       df       MS            Number of obs =       400
-------------+------------------------------         F(  3,   396) =    444.51
       Model |  96860.6041     3   32286.868         Prob > F      =    0.0000
    Residual |  28763.4668   396  72.6350173         R-squared     =    0.7710
-------------+------------------------------         Adj R-squared =    0.7693
       Total |  125624.071   399  314.847296         Root MSE      =  8.522618
                                                     Res. dev.     =   2845.31
(hockey)
------------------------------------------------------------------------------
        talk |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  breakpoint |   12.79033   .6941923    18.42   0.000     11.42557    14.15509
     slope_l |   1.107216   .4397491     2.52   0.012     .2426814    1.971751
     slope_r |   3.999005   .1564209    25.57   0.000     3.691486    4.306525
        cons |   4.656507   4.259092     1.09   0.275    -3.716751    13.02976
------------------------------------------------------------------------------
* Parameter cons taken as constant term in model & ANOVA table
 (SEs, P values, CIs, and correlations are asymptotic approximations)

However, there are 2 repeated observations per child, so it would be appropriate to adjust standard errors for clustering on their id. However, on running nl hockey with clustering on the id variable, I retrieve the following error:

Code:

. nl hockey talk age, cluster(id)
cluster() not allowed
r(198);

I am wondering how I may be able to produce clustered standard errors for results retrieved when using nl hockey. I'll explain what I have tried / looked into before opening the floor to suggestions.

I looked into writing code based on the programmer’s command _robust. Please see the manual:

HTML Code:

https://www.stata.com/manuals/p_robust.pdf

However, it requires variables of the estimated parameters - for which I am not sure what should be contained in them. I wondered whether the model equation would need to be solved for each of the parameters, and the estimates for each parameter entered into each of the corresponding variables. The code is as follows:

Code:

. marksample touse
. markout `touse' id, strok
. nl hockey talk age if `touse'
. matrix D = e(V)
. forvalues i = 1/4 {
.         forvalues j = 1/4 {
.                 matrix D[`i',`j'] = ((e(V)[`i',`j'])*(sqrt(abs(e(V)[`i',`j']))/e(rmse))^2)/abs(e(V)[`i',`j'])
.         }
. }
. matrix b = e(b)
. gen breakpoint = //???
. gen slope_l =    //???
. gen slope_r =    //???
. gen cons =       //???
. local n = e(N)
. local k = colsof(D)
. predict double e if `touse', residual
. sort `touse' id
. by `touse' id: gen byte count = 1 if _n == 1 & `touse'
. summarize count, meanonly
. local n_id = r(sum)
. local dof = `n_id' - 1
. local clopt "cluster(id)"
. ereturn post b D, dof(`dof') esample(`touse')
. _robust e if e(sample), minus(`k') `clopt'
. ereturn display
. drop breakpoint slope_l slope_r cons e count

I also looked into re-writing the program as a substitutable expression or function evaluator program, but this hasn't been successful so far and I would welcome any advice on how to do this from the ado file for nl hockey posted above. Please see the nl manual:

HTML Code:

https://www.stata.com/manuals/rnl.pdf

I look forward to responses.

Best wishes,
Simon Baldwin

Tags: None

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17701
#2

17 Jun 2022, 12:22

Simon:
as per FAQ, you're requested to post your query on the General forum, as this one is for practising purposes only. Thanks.

Kind regards,
Carlo
(Stata 19.0)
Comment
Simon Baldwin

Join Date: Jun 2022

Posts: 6
#3

20 Jun 2022, 10:26

Thanks Carlo, this was just a practice for some colleagues to preview before posting to the General forum. I will move the post over shortly after making some amendments.

Best wishes,
Simon
Comment

Announcement

nl hockey - Clustered Standard Errors

Comment

Comment