  • Urgent help please! Using forvalues for nested loop regression by firm and year

    Hi there,

    I am running a market model using daily returns for 36 firms from 2006 to 2014. I want to get the residuals from regressions run separately by firm and year.

    It worked when I used monthly returns, but it does not work this time! My code is below:

    gen resi=.
    forval y= 2006/2014 {
    forval i=1/36 {
    reg dretwd marketreturn if stkcd=='i' & year=='y'
    predict r if stkcd=='i' & year=='y', resid
    replace resi = r if stkcd=='i' & year=='y'
    drop r
    }
    }

    Then Stata reports: 'i' invalid name.

    Can anyone please help me out?

    I also tried bysort. Stata ran 36 regressions, but I don't know how to get the residuals. The commands were: egen code = group(stkcd year), then bysort code : reg dretwd marketreturn.

    Thanks in advance.

    Regards,
    Jennifer

  • #2
    No, this code did not work when you had monthly data. This code would never work: it is riddled with syntax errors. What probably happened is that you either incorrectly re-typed existing code that had worked correctly, or perhaps you "laundered" it through a word processing program that changed some of the characters. (Code that is exported to Microsoft Word, then copied and pasted back into Stata often fails for just this reason.)

    Here's the specific problem. All of the references to 'i' and 'y' are syntax errors. The correct way to dereference a local macro is `i'. The left quote is a different character from the right quote; it is slanted, not vertical. (Your code uses right quotes on both sides.) On a US keyboard, the left quote character is found on the key to the left of the 1! key. Once you correct all of the left quotes, I believe the code will run properly.
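
    As a minimal sketch of the quoting rule (the loop range 1/3 is only an illustration, not your data), a local macro is opened with the left quote and closed with the ordinary apostrophe:

    ```stata
    * Each pass substitutes the current value of the local macro i.
    forvalues i = 1/3 {
        display "firm `i'"
    }
    ```

    This should display firm 1, firm 2, and firm 3. Written as 'i', Stata does not see a macro reference at all and treats the characters literally.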

    • #3
      Well, the output is telling you exactly what the problem is. It ran through the first two regressions (not just one). Then on the third round, firm 3 in year 2006 it found no observations. So that is exactly where your problem is: there are no observations for firm 3 in year 2006 that can be included in the estimation sample. In other words, you have no observations for firm 3 in year 2006 that have non-missing values for both dretwd and marketreturn. You will be able to verify that directly by running -summ dretwd marketreturn if year == 2006 & stkcd == 3-.
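
      A quick way to see which firm-year combinations are affected, sketched here with the variable names from this thread and assuming stkcd and year are both numeric:

      ```stata
      * Complete cases for the firm-year that stopped the loop.
      count if year == 2006 & stkcd == 3 & !missing(dretwd, marketreturn)

      * Complete cases for every firm-year; cells showing 0 or 1 would stop the loop.
      tabulate stkcd year if !missing(dretwd, marketreturn)
      ```

      A firm with no complete cases in any year gets no row at all in the table, so compare the rows against the full list of 36 firms.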

      As for what to do about it, that depends. It may be that your data set is incorrect and there should be such observations. If that is the case, you have to fix your data set by supplying the needed observations (or replacing the missing values with non-missing values). If the data set is incorrect in this way, it is likely to have other inappropriately missing data as well, and you should look for all of that and fix it all before proceeding.

      Or, it may be the case that there really aren't supposed to be any data on firm 3 in year 2006. In that case, the code needs to be rewritten to allow for that. If this is the case, it is likely that there will also be other firm/year combinations for which there will be no observations (or just 1 observation, which would also cause Stata to complain and halt). The trick here is the judicious use of -capture- to allow Stata to continue in the face of this expected problem. The operative term here is judicious: you don't want to just blunder through if there is some problem other than the anticipated possibility of insufficient numbers of complete observations. So we write the code so that after the regression is done, we check the return code, c(rc). If the return code is zero, that means that the regression command executed without problems, and we should go ahead and calculate the residuals. If it is 2000 or 2001, then we know that Stata found either no observations at all (2000) or too few observations for the number of variables in the model (2001). Since these are anticipated problems, we just move on to the next firm-year combination. If the return code is anything else, then there is some other problem that we did not expect. In that case we program so that we get an error message telling us the values of stkcd and year that triggered this error condition, and we replay the error message itself, and then stop the program.

      Code:
      gen resi = .
      
      forvalues y = 2006/2014 {
          forvalues i = 1/36 {
              capture regress drretwd marketreturn if stkcd == `i' & year == `y'
              if c(rc) == 0 {    // SUCCESSFUL REGRESSION
                  predict r, resid
                  replace resi = r if stkcd == `i' & year == `y'
                  drop r
              }
              else if !inlist(c(rc), 2000, 2001) {    // ERROR OTHER THAN NO OR INSUFFICIENT OBSERVATIONS
                  display in red "Unanticipated error: stkcd = `i', year = `y'"
                  error c(rc)
              }
          }
      }
      Before you run this code, be sure to read the manual section on -capture- (which will also explain c(rc)) so you know what it is doing and understand how it works.
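
      To see the return codes in isolation, here is a small sketch on the auto data set shipped with Stata (not your data). The condition foreign == 2 is chosen deliberately because no car in that data set satisfies it:

      ```stata
      sysuse auto, clear

      * foreign only takes the values 0 and 1, so this estimation sample is empty.
      capture regress price mpg if foreign == 2
      display c(rc)    // nonzero: 2000 is the "no observations" code described above

      * This estimation sample exists, so the regression succeeds.
      capture regress price mpg if foreign == 1
      display c(rc)    // 0 after a successful command
      ```

      Because -capture- suppresses both the output and the error message, checking c(rc) immediately afterwards is the only way to tell the two cases apart.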
      Last edited by Clyde Schechter; 26 Mar 2017, 22:52.

      • #4
        Re #4: you can't calculate and store the residuals using that approach. See my response in #5, which crossed with #4.

        • #5
          It appears you did not correctly copy the code from #5 and paste it into your program, because the message tells you there was a syntax error in the regress command on the very first regression. Carefully compare your program to #5 and see where you have gone wrong on the regress command. Or copy the command from #5 and paste it into your program, replacing the regress command you have.

          • #6
            First of all, based on what you have said, it appears that your "year" variable is not a Stata Internal Format year, but rather something else.

            Before working with dates and times, any Stata user should thoroughly review the very detailed Chapter 24 (Working with dates and times) of the Stata User's Guide PDF. After that, the help datetime documentation will usually be enough to point the way. All Stata manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through the PDF Documentation section of Stata's Help menu.

            Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. See especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using CODE delimiters, as described in section 12 of the FAQ.

            Show us the results of
            Code:
            describe year
            codebook year
            In posting these results, please copy them from the Results window or your log file into a code block in the Forum editor, as discussed above. For example, the following:

            [code]
            // sample code
            sysuse auto, clear
            describe
            [/code]

            will be presented in the post as the following:
            Code:
            // sample code
            sysuse auto, clear
            describe

            • #7
              OK, I see the error in #5. The variable name dretwd was mistyped as drretwd. Sorry about that. Just fix that typo and it should run properly.

              An important lesson for anyone reading this: if you post a problem and don't give an example data set, it is usually not possible to test out the code. Typos, particularly on variable names that have no mnemonic value or meaning to the person helping out (I have no clue what dretwd might mean or abbreviate), are easily made. If I had a sample of the data available when I wrote #5, I would have tested the code on it, and that error would have been found and corrected before the code got posted.

              So, anyone posting a question who wants help with code should post example data to increase the likelihood that the first response will be helpful. The most helpful way to do this is with the -dataex- command. Install -dataex- by running -ssc install dataex-. Then run -help dataex- to read the simple instructions. By using -dataex- you provide those who wish to help you with a completely faithful replica of your Stata data set (or the part of it you include in your -dataex- example), that can be imported to Stata by a simple copy/paste operation.

              When posting example data, you should pick a representative subset of your data set (unless the whole thing is very small). Include enough observations that the full range of situations the code needs to handle is represented, and be sure to include all variables that are needed for the calculations desired.
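
              A sketch of that workflow, using the variable names from this thread. The range in 1/20 is an arbitrary illustration; choose observations that cover the cases your code has to handle:

              ```stata
              * One-time installation, then read the instructions.
              ssc install dataex
              help dataex

              * Produce a copy-and-paste example of the variables the regression needs.
              dataex stkcd year dretwd marketreturn in 1/20
              ```

              Copy the output from the Results window and paste it into your post.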
              Last edited by Clyde Schechter; 27 Mar 2017, 20:14.
