
  • Unknown cause of "repeated time values within panel" error message when bootstrapping

    I have a panel dataset, and I am running the following program in STATA 11.2:

    capture program drop tstep
    program define tstep
    args years obp slg fld age age2 exp exp2 newteam aqual1 aqual2 aqual3 aqual4 season status cat club
    {
    tempvar Tobp Tslg Robp Rslg absAobp absAslg
    xtreg obp L.obp LL.obp age exp, fe
    predict `Tobp'
    gen `Robp'=`Tobp'-obp
    xtreg slg L.slg LL.slg age exp, fe
    predict `Tslg'
    gen `Rslg'=`Tslg'-slg
    gen `absAobp'=abs(`Robp')
    gen `absAslg'=abs(`Rslg')
    xi: regress years `absAobp' `absAslg' $varlist
    }
    end

    It is a two-step process that uses two generated regressors, "absAobp" and "absAslg", in the second stage, hence my need for bootstrapping my standard errors. I've already used xtset and the panel variable is called "agent". The panel is unbalanced, but the year and agent uniquely identify each observation (yes, I've tested this using both xtdescribe and duplicates list).

    In order to bootstrap my program, I use the following command:

    gen newid=agent
    bootstrap, reps(50) cluster(agent) idcluster(newid): tstep years $varlist

    $varlist is a global macro holding a variable list, to cut down on typing. "newid" and my year variable uniquely identify my panel data after it's been bootstrapped, and I've verified this with the following test:

    bsample, cl(agent) idcl(newid)
    list year agent newid, sepby(newid)

    I also used duplicates list to make damn sure.

    When I try to run the above bootstrap command, I receive the error message "repeated time values within panel". By the way, if I don't bootstrap, the program runs just fine.

    Many thanks for the help!

  • #2
    Your data don't have duplicates, but your bootstrap sample does. Sampling with replacement means what it says.
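
The point can be checked directly on a resampled dataset. The following is a minimal sketch, assuming the nlswork example data (panel id idcode, time variable year) stands in for the poster's data; the seed is arbitrary. The first xtset is expected to fail with the same "repeated time values within panel" message whenever some panel is drawn more than once, while the relabelled identifier created by idcluster() should be accepted.

```stata
* Sketch only: nlswork stands in for the poster's data (an assumption)
webuse nlswork, clear
set seed 12345                              // arbitrary seed, for reproducibility

* Draw whole panels with replacement; idcluster() creates newid itself
bsample, cluster(idcode) idcluster(newid)

* Original id: panels drawn twice now repeat the same year values
capture noisily xtset idcode year
display "return code with idcode: " _rc

* Relabelled id: each drawn copy of a panel gets its own value
xtset newid year
```

If this holds, a resampled dataset can be free of duplicates in (newid, year) and still trigger the error whenever the panel settings in force refer to the original identifier.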

    • #3
      Thank you Nick, but I am aware of how bootstrapping and sampling with replacement works, so perhaps the issue is with my understanding of the STATA commands.

      It is my understanding that within the bootstrap command the cluster(panel-variable) option communicates to STATA that a panel variable exists within the dataset. This tells STATA to sample the full time-series block associated with that panel variable, not just individual observations. This, of course, is critical for using any type of time-series methods, such as lag operators, fixed effects, etc. As you insinuated above, this then leads to multiple blocks of the same panel variable with its associated time series, thereby corrupting the uniqueness of the panel variable-time variable match.

      The idcluster(new-panel-variable) option, I believe, exists in order to rectify this issue. Its sole purpose is to re-name each block that is sampled with a unique panel-variable value, thus creating a bootstrapped sample that contains observations each uniquely identified by the new-panel-variable and old time value. This allows time-series analysis to be used with the new bootstrapped sample. If my understanding is indeed correct, then my bootstrapped sample doesn't have any duplicates across the newly specified panel-time dimension. This is corroborated by using the duplicates list command after creating a bootstrapped sample, as described in my previous post.

      I may just be misinterpreting what the idcluster() option does, but if not, then I still need an answer for why I am receiving the "repeated time values within panel" error. I also still need some suggestions for how to fix it. Even if my understanding of idcluster() is incorrect, I still need some suggestions for how to rectify my code.

      Hopefully you have some ideas! If anyone else has a clue, please chime in! Many thanks in advance for the help.

      • #4
        I'm speculating a bit here, as I have never done a bootstrap analysis of clustered data myself. But looking at [R] bootstrap, p. 232, it says:

        Similarly, when you have panel (longitudinal) data, all resampled panels must be unique
        in each of the bootstrap samples to obtain correct bootstrap estimates of statistics. Therefore,
        both cluster(panelvar) and idcluster(newpanelvar) must be specified with bootstrap, and
        i(newpanelvar) must be used with the main command. Moreover, you must clear the current xtset
        settings by typing xtset, clear before calling bootstrap

        [emphasis added]

        I think that is where your code breaks down. It's worth a try, anyhow.


        • #5
          Clyde, I really appreciate the feedback and the idea. I had also read that passage and had previously tried using xtset, clear immediately before running the bootstrap command:

          gen newid=agent
          xtset,clear
          bootstrap, reps(50) cluster(agent) idcluster(newid): tstep years $varlist

          Unfortunately, I got the error message "time variable not set". I should have stated this in my original post, so I apologize for not doing so. (I actually just tried it again, just in case, but it had the same result.)

          Any more speculations or ideas? (fingers crossed)
          Last edited by Tom Zimmerfaust; 19 Apr 2015, 16:24.

          • #6
            Even wilder speculation. How about including -xtset newid year- near the top of the tstep program? It seems as if the problem with my previous recommendation is that you are using lag operators--which can't be done unless the time variable has been set. It needs to be set somewhere--and inside tstep seems suitable. (I would, in light of the text in the manual, retain the -xtset, clear- command before the call to bootstrap.)

            This is a kludgy solution (assuming it even works!) because it relies on an occult dependency between the bootstrap call and the program, in that both must agree on the name of the -idcluster- variable. So I would hope that, even if this works, somebody can come up with something better. But I'm hoping that at least this will enable you to begin moving forward.
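
A minimal sketch of this suggestion, again assuming the nlswork example data in place of the poster's; the program name lagboot and the returned scalar b_lag are hypothetical. It has not been run on the original data, so treat it as an illustration of the idea rather than a confirmed fix.

```stata
* Hypothetical program: declares the panel inside the bootstrapped command
capture program drop lagboot
program define lagboot, rclass
    * The name newid must match idcluster() in the bootstrap call below
    xtset newid year
    * Lag operators work because the time variable is set in this sample
    xtreg ln_wage L.ln_wage age, fe
    return scalar b_lag = _b[L.ln_wage]
end

webuse nlswork, clear
xtset, clear                                // as the manual passage advises
bootstrap b_lag = r(b_lag), reps(50) seed(12345) ///
    cluster(idcode) idcluster(newid): lagboot
```

The dependency described above is visible here: renaming the idcluster() variable in the bootstrap call without also editing the program would break it.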
            Last edited by Clyde Schechter; 19 Apr 2015, 17:04.

            • #7
              There's another crucial message in the paragraph that Clyde quoted. Besides xtset, clear, you must also specify i(newid) as an option to xtreg inside the program. Before the invention of xtset, this was the way one specified the panel id to xtreg. Now it is only necessary when one uses a bootstrap prefix; hence its appearance in the bootstrap documentation.

              Code:
              cap program drop myboot
              program define myboot, rclass
              xtreg ln_w grade, i(newid)
              return scalar grade = _b[grade]
              end
              
              webuse nlswork, clear
              xtset, clear
              bootstrap r(grade), cluster(idcode) idcluster(newid): myboot
              As an aside:

              Your statement
              Code:
              gen newid=agent
              was unnecessary, as the idclus(newid) option generates the new variable. Each time a panel is drawn for the bootstrap, it gets a different value for newid.
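
Putting the thread's suggestions together for a two-step program like the one in the first post, a hedged sketch might look as follows. It assumes the nlswork example data; the program name twostep, the returned scalar b_absres, and the choice of hours as the second-stage outcome are all hypothetical. Because the first stage uses a lag operator, it sets the panel with xtset newid year inside the program rather than relying on i(newid) alone.

```stata
* Hypothetical two-step program: first-stage residuals feed the second stage
capture program drop twostep
program define twostep, rclass
    tempvar fit absres
    xtset newid year                          // must match idcluster() below
    xtreg ln_wage L.ln_wage age, fe           // first stage, with a lag
    predict double `fit', xb
    generate double `absres' = abs(`fit' - ln_wage)
    regress hours `absres' age                // second stage, generated regressor
    return scalar b_absres = _b[`absres']
end

webuse nlswork, clear
xtset, clear
* No "gen newid" beforehand: idcluster() creates the variable
bootstrap b_absres = r(b_absres), reps(50) seed(12345) ///
    cluster(idcode) idcluster(newid): twostep
```

Both stages are re-estimated in every replication, which is what lets the bootstrap standard error reflect the estimation error in the generated regressor.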
              Last edited by Steve Samuels; 19 Apr 2015, 18:12.
              Steve Samuels
              Statistical Consulting

              Stata 14.2
