  • Syntax error forvalues and two if statements

    Hi all,

    I'm not too experienced with coding or with STATA.

    For an experiment I'm generating new variables. For example, 'diffcont1995' should hold the difference in variable 'varcr' between fiscal year 1995 and the base year (1994) for each unique company (gvkeyind). I'm doing the same for 'diffcont1997', 'diffcont1999' and 'diffcont2001'.

    The same goes for the variables 'diffsoxXXXX', which have base year 2002, and the variables 'diffdoddXXXX', which have base year 2010.

    I have tried different solutions but have not been able to get the results I want. Does anybody know how I can best solve this?

    Code:
    gen diffcont1995=.
    gen diffcont1997=.
    gen diffcont1999=.
    gen diffcont2001=.
    gen diffsox2003=.
    gen diffsox2005=.
    gen diffsox2007=.
    gen diffsox2009=.
    gen diffdodd2011=.
    gen diffdodd2013=.
    gen diffdodd2015=.
    
    sum gvkeyind
    scalar max2=r(max)
    local k=max2
    set more off
    forvalues i=1(1)`k'{
    
    replace diffcont1995=l1.varcr-varcr if fyear=1995 & gvkeyind==`i'
    replace diffcont1997=l3.varcr-varcr if fyear=1997 & gvkeyind==`i'
    replace diffcont1999=l5.varcr-varcr if fyear=1999 & gvkeyind==`i'
    replace diffcont2001=l7.varcr-varcr if fyear=2001 & gvkeyind==`i'
    replace diffsox2003=l1.varcr-varcr if fyear=2003 & gvkeyind==`i'
    replace diffsox2005=l3.varcr-varcr if fyear=2005 & gvkeyind==`i'
    replace diffsox2007=l5.varcr-varcr if fyear=2007 & gvkeyind==`i'
    replace diffsox2009=l7.varcr-varcr if fyear=2009 & gvkeyind==`i'
    replace diffdodd2011=l1.varcr-varcr if fyear=2011 & gvkeyind==`i'
    replace diffdodd2013=l3.varcr-varcr if fyear=2013 & gvkeyind==`i'
    replace diffdodd2015=l5.varcr-varcr if fyear=2015 & gvkeyind==`i'
     
    di `i' " / " `k'
    }

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float gvkeyind int fyear float varcr
     1 1994   0
     1 1995   0
     1 1996  -1
     1 1997  -1
     1 1998  -1
     1 1999   0
     1 2000   0
     1 2001   0
     1 2002   0
     1 2003  -1
     1 2004  -4
     1 2005  -4
     1 2006  -4
     1 2007  -4
     1 2008  -3
     1 2009  -3
     1 2010  -3
     3 1994   0
     3 1995   0
     3 1996  -5
     3 1997  -5
     3 1998  -5
     3 1999  -5
     3 2000  -5
     3 2001   0
     3 2002   0
     3 2003   0
    10 1998   1
    10 1999   1
    10 2000   1
    10 2001   1
    10 2002   1
    10 2003   1
    10 2004   0
    10 2005  -1
    10 2006  -1
    10 2007   0
    11 2000   5
    13 1998  -4
    13 1999  -4
    15 1994  -2
    15 1995  -2
    15 1996  -2
    15 1997  -2
    15 1998  -1
    15 1999  -1
    15 2000  -1
    15 2001  -1
    15 2002  -3
    15 2003  -7
    15 2004 -10
    15 2005 -10
    15 2006 -10
    15 2007  -6
    15 2008  -6
    15 2009  -7
    15 2010  -7
    16 1994  -1
    16 1995  -1
    16 1996  -1
    16 1997   0
    16 1998   5
    16 1999   5
    16 2000   6
    16 2001   6
    16 2002   1
    16 2003  -3
    16 2004  -6
    16 2005  -7
    16 2006  -6
    18 1994  -4
    18 1995  -5
    18 1996  -5
    24 1994  -1
    24 1995  -1
    24 1996  -1
    24 1997   0
    24 1998   0
    24 1999   0
    24 2000   0
    24 2001   0
    24 2002   0
    24 2003   0
    24 2004   0
    24 2005   0
    24 2006  -1
    24 2007  -1
    24 2008  -1
    24 2009  -1
    24 2010  -1
    26 1994  13
    26 1995  13
    26 1996  13
    26 1997  12
    26 1998  12
    26 1999  12
    26 2000  12
    26 2001   5
    26 2002  10
    26 2003  10
    end

  • #2
    Thanks for posting a data example. Here is code I wrote that starts from there:

    Code:
    * specific example
    
    egen val1994 = total((fyear == 1994) * varcr), by(gvkeyind)
    gen diff1994 = varcr - val1994
    
    * illustration for other base years
    
    foreach base in 1997 2002 {
        egen val`base' = total((fyear == `base') * varcr), by(gvkeyind)
        gen diff`base' = varcr - val`base'
    }

    What's wrong with your code?

    1. There is no need to create so many variables for a given base year.

    2. There is no need to loop over firms: what differs between firms are the base values, which can be calculated in one line using egen.
    An expression such as fyear == 1994 is true (evaluates to 1) at most once in each panel and false (0) otherwise, so the total of (fyear == 1994) * varcr within a firm is just that firm's 1994 value.
    See also http://www.stata-journal.com/sjpdf.h...iclenum=dm0055

    3. You don't tell us where the syntax error arises, but my guess is that you need expressions like

    Code:
    if fyear==1995

    A single = is not used for comparisons; testing equality requires ==.
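
    A minimal sketch of that fix, assuming one observation per firm-year and that the panel has been declared with xtset (a step not shown in #1). The toy data are made up for illustration. With == in the qualifier, the lag operator already works within each firm, so no loop over gvkeyind is needed:

    Code:
    clear
    input float gvkeyind int fyear float varcr
    1 1994  0
    1 1995 -2
    3 1994  4
    3 1995  1
    end
    * declare the panel so that L1. refers to the previous year within a firm
    xtset gvkeyind fyear
    * same expression as in #1, but with == for the comparison
    gen diffcont1995 = L1.varcr - varcr if fyear == 1995
    list, sepby(gvkeyind)

    Note that this keeps the sign convention of #1 (base year minus current year), which is the opposite of the diff1994 variable above.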

    For "unique" read "distinct" (http://www.stata-journal.com/sjpdf.h...iclenum=dm0042) and for "STATA" read "Stata" (http://www.statalist.org/forums/help#spelling)
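
    A caveat on the egen total() line above, sketched under the assumption of at most one observation per firm-year: a firm with no observation in the base year (firms 10 and 13 in the example data have none for 1994) gets a base value of 0 rather than missing, so its differences are just varcr itself. One possible guard is to take the mean of cond() within firm, which is missing when the base year is absent. The toy data and the names base1994 and diffbase1994 are made up for illustration:

    Code:
    clear
    input float gvkeyind int fyear float varcr
     1 1994  2
     1 1995 -1
    10 1998  1
    10 1999  1
    end
    * cond() passes varcr through only in the base year; mean() ignores the missings
    egen base1994 = mean(cond(fyear == 1994, varcr, .)), by(gvkeyind)
    * missing for any firm that has no 1994 observation
    gen diffbase1994 = varcr - base1994
    list, sepby(gvkeyind)

    Whether missing or 0 is the better result for such firms depends on the analysis; missing at least keeps them out of a regression automatically.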

    • #3
      Hi Nick, thank you for your quick reply.

      I agree with all except for point 1. I need these specific variables in order for me to regress them against a dependent variable.

      Would you be able to modify the 'foreach' example to calculate them for these specific variables?

      Looking forward to your reply.

      • #4
        I very much doubt your claim: You really shouldn't want separate variables for each firm, as they won't be defined for other firms and your sample size will be reduced to zero.

        What you may want instead is just to use factor variable notation as appropriate

        -- or just to use the information as one variable.
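
        A rough illustration of what factor variable notation could look like here. It uses the auto dataset shipped with Stata as a stand-in, since the regression variables are not in the data example: foreign plays the role of fyear and weight the role of the difference variable. The thread analogue in the comment is an assumption about the intended model, and depvar is a placeholder name:

        Code:
        sysuse auto, clear
        * separate intercept and separate slope of weight for each level of foreign
        regress price i.foreign i.foreign#c.weight
        * thread analogue (assumed): regress depvar i.fyear i.fyear#c.diff1994

        One difference variable then yields a slope for each year, without creating a new variable per year.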
        Last edited by Nick Cox; 24 Jul 2016, 09:35.

        • #5
          I don't need a separate variable for each firm. I need a variable that contains all the variances for a specific year, as the variable names indicate. See the variable list below.

          So diffcont1995 would contain only variances for observations from 1995 compared to 1994, diffcont1997 from 1997 compared to 1994, and so on for diffcontXXXX.
          diffsox2003 would contain only variances from 2003 compared to 2002, diffsox2005 from 2005 compared to 2002, and so on for diffsoxXXXX.
          diffdodd2011 would contain only variances from 2011 compared to 2010, and diffdodd2013 variances from 2013 compared to 2010.
          This is to obtain regression output for the specific years 1995, 1997, 1999, 2001 ... 2015 in three different periods (with base years 1994, 2002 and 2010).

          Please let me know if you can help me generate these 11 variables with the needed variances.

          diffcont1995
          diffcont1997
          diffcont1999
          diffcont2001
          diffsox2003
          diffsox2005
          diffsox2007
          diffsox2009
          diffdodd2011
          diffdodd2013
          diffdodd2015
          Last edited by Joshua Tievoor; 24 Jul 2016, 10:28.

          • #6
            As far as I can see you just want to separate individual variables into several variables, and there is a command for that, separate.

            Fond though I am of that command, I fail to see that you need it at all for regression purposes.
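
            For completeness, a sketch of how separate might produce the variables listed in #5, assuming a difference variable like diff1994 from #2 already exists. The toy data are made up; as far as I know, the new names are the generate() stub followed by the values of the by() variable, and the if qualifier limits which years get a variable:

            Code:
            clear
            input float gvkeyind int fyear float diff1994
            1 1994  0
            1 1995 -2
            1 1996 -3
            1 1997 -1
            3 1995  4
            3 1997  5
            end
            * one new variable per selected year, named stub + year
            separate diff1994 if inlist(fyear, 1995, 1997), by(fyear) generate(diffcont)
            list, sepby(gvkeyind)

            Each new variable is missing outside its own year, which is why using them together in one regression leaves no usable observations.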

            If you want separate regression results for different years, just use statsby.

            In a statistical forum, by default "variance" means variance, not difference. In an accountancy forum, it might be the other way round.

            • #7
              Hm, okay. Indeed, your solution in combination with 'statsby' seems to be what I'm looking for.

              How would I best have Stata display the years mentioned above?

              I'm trying the command below but keep getting the error 'no; data in memory would be lost', 'r(4);'. How would I fix this?

              Code:
              statsby, by(fyear): reg aac diff1994 size lev mtb fin acquis litit

              In accounting, a variance would have the same definition, if I'm not mistaken. I call it a 'difference' in this example because it is a difference between two variances.

              • #8
                You need the clear option of statsby
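
                A small runnable sketch of the option, using the auto dataset shipped with Stata as a stand-in for your data (foreign plays the role of fyear, and the regression is only illustrative):

                Code:
                sysuse auto, clear
                * clear lets statsby replace the data in memory with the results
                statsby _b _se, by(foreign) clear: regress price weight
                list

                The dataset now holds one observation per group with the coefficients and standard errors. If you would rather keep your original data in memory, the saving() option of statsby writes the results to a file instead.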

                You are calculating differences between values and in statistics such differences are never called variances. I've seen that terminology in discussions of accounts.
