  • Looping in a panel dataset

    Hi, long-time reader, first-time poster. I'm using Stata 12 on a Mac.

    My dataset is a panel in long format, with 14 groups and a weekly time variable covering 14 years. The groups are identified by 'id', which takes values from 1 to 14.

    I've been experimenting with the loop commands and looking at the manuals, but I just can't seem to find the right way to do what I want. My data are currencies, and I wish to calculate the yearly forward premium, i.e. reg y x for each currency (id), which I got to run as:

    foreach i of varlist id {
    reg y x
    }

    But this only seems to run one regression, whereas I want it to run a separate regression for each id code. If possible, I'd like it to run per currency code and per year (I've already defined a year variable, 2000-2014). I can do this manually, but that would involve 196 (14*14) separate regressions, which seems inefficient. I feel I'm getting closer to the correct way, and there should be a loop that does this.

    Then (I know this may be detracting from the initial issue), is it possible to take the fitted values from each regression and use them in a second regression for each year and currency code, i.e. reg x var1 var2 var3, where 'x' is taken from the first set of regressions? Presumably this would need to go inside the loop so that Stata can pick up the fitted values from the previous regression (I may be incorrect here), but that is ultimately what I am trying to do.

    Any help or comments much appreciated - apologies if I've missed anything important on this post, it's my first post.

    Thanks!

  • #2
    Code:
    bysort id: regress y x



    • #3
      There are several confusions here. One is that

      Code:
      regress y x
      is unaffected by being within a loop; you will just get the same regression repeated every time around the loop. Another is that

      Code:
      foreach i of varlist id {
      
      }
      is not a loop over the distinct values of the variable id: it is a loop over the variable list id, which is just one variable name, so that loop will be executed just once.

      What you wrote is perfectly legal, however, and as you noticed, just equivalent to regress y x.
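
      For contrast, here is a minimal sketch of a loop that does run over the distinct values of id. It assumes the variable names y, x and id from the question; the macro name ids is an arbitrary choice, not something from the original post.

      ```stata
      * Collect the distinct values of id into a local macro (the name "ids" is arbitrary)
      levelsof id, local(ids)

      * Run one regression per distinct value of id
      foreach i of local ids {
          display as text _n "id = `i'"
          regress y x if id == `i'
      }
      ```

      The if qualifier is what makes each pass through the loop use a different subset of the data; without it, every pass would repeat the same regression.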

      Let's back up.


      Code:
      bysort id year : regress y x
      is the easiest way to run the regressions, except that in your case you want to save the fitted values. I assume that you don't really want to use the values of the predictor x but the fitted or predicted values for y. A start for you might be something like

      Code:
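      * Variable to hold the fitted values from every id-year regression
      gen yfitted = .
      * Number the id-year combinations 1, 2, ...
      egen group = group(id year)
      su group, meanonly
      
      forval g = 1/`r(max)' {
           regress y x if group == `g'
           predict yfit
           replace yfitted = yfit if group == `g'
           * The second regression must run before yfit is dropped
           regress yfit var1 var2 var3 if group == `g'
           drop yfit
      }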
      http://www.stata.com/support/faqs/da...ach/index.html explains more. Otherwise, I don't know what you've been reading but http://www.stata-journal.com/sjpdf.h...iclenum=pr0005 is more discursive than the manual entries on looping.
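
      If what is wanted in the end is the estimated coefficient for each currency and year rather than the regression output itself, statsby is one possible alternative to an explicit loop. This is only a sketch under the assumption that the variables are named y, x, id and year as in the question; fp_results is a hypothetical file name.

      ```stata
      * Save the coefficients and standard errors of each id-year regression
      * to a new dataset ("fp_results" is a hypothetical file name)
      statsby _b _se, by(id year) saving(fp_results, replace): regress y x

      * Look at what was saved without replacing the data in memory;
      * the slope on x is stored as _b_x and its standard error as _se_x
      describe using fp_results
      ```

      Note that statsby collects estimates only; it does not store fitted values, so it does not replace the loop if the fitted values are needed for a second regression.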
      Last edited by Nick Cox; 08 Jan 2015, 06:07.



      • #4


        Thank you for your kind reply. That is definitely a step in the right direction, and thank you for the links! Looking promising.



        • #5
          Thanks again for your help, I've managed to get it to do what I want!



          • #6
            Dear all,
            I would like to run a regression for every id in a panel dataset and save the residuals for further analysis.
            I tried the following:
            Code:
            gen uhat_ar2=.
            egen group = group(SeriesIdCode)
            summarize group
            forval g = 1/`r(max)' {
                arima lnretinf_m if group==`g', ar(1/2)
                predict uhat2, resid
                replace uhat_ar2=uhat2 if group==`g'
                drop uhat2
            }
            But I get the following error:
            sample may not include multiple panels
            r(459);

            end of do-file
            What am I doing wrong? My time variable is called "date". If I include it when generating "group", r(max) equals the total number of observations in the panel, which is not what I need. Thank you again for your help.
