
  • flag certain observation

    Dear reader,
    I would like to flag every 24th observation, starting from the fifth observation. The following code is not working. How should I alter it?

    gen flag=0
    forvalue 1/$N{
    replace flag=1 if (obs==_n+4+_n*28)
    }

    Kind regards
    Last edited by MIchael Jefferson; 23 Feb 2019, 04:58.

  • #2
    For an entire dataset, and roughly what you typed yourself:
    Code:
    gen flag=0
    local N =_N
    forvalues i = 5(24)`N'{
    replace flag=1 if `i'==_n
    }
    Or by some sort of ID:
    Code:
    gen flag = 0
    bys companyID: gen obsno = _n
    replace flag = 1 if (obsno-5)/24==int((obsno-5)/24)
    A loop over companyID is also possible, but this is quick and easy.



    • #3
      No loop is needed here, as @Jorit Gosens indicates. And if, as #1 implies, only observation numbers matter, then only one new variable is needed: the flag itself.

      Code:
      . clear  
      
      . set obs 100
      number of observations (_N) was 0, now 100
      
      . gen flag = mod(_n-5, 24) == 0 
      
      . list if flag 
      
           +------+
           | flag |
           |------|
        5. |    1 |
       29. |    1 |
       53. |    1 |
       77. |    1 |
           +------+
      The loop in #1 is illegal at the start, regardless of what the global macro is defined to be, as it lacks a macro declared as loop counter.
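
      If the flag is wanted within groups rather than over the whole dataset, the same mod() test can probably be combined with by:. The following is only a sketch: companyID is borrowed from #2, while time and the toy data are made up here so that the order within each group is well defined.

      ```stata
      clear
      set obs 120
      * hypothetical identifiers: two companies with 60 observations each
      gen companyID = ceil(_n / 60)
      gen time = _n
      * under by:, _n counts observations within each company
      bysort companyID (time): gen flag = mod(_n - 5, 24) == 0
      list companyID time flag if flag
      ```

      Sorting on time inside the parentheses matters because a plain bysort companyID: does not guarantee the order of observations within each company.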



      • #4
        I return to the loop in #1.

        The global macro N is not defined for us. Let's guess that earlier you did this

        Code:
        global N = _N 
        so what you would be trying is a loop over observations.

        The variable obs is not defined for us. Let's guess that earlier you did this.

        Code:
        gen obs = _n
        We should not have to guess, naturally. We can only understand your data to the extent that you explain it. Supplying a loop counter (whose absence was noted in #3) would then make the code something like

        Code:
        global N = _N
        gen obs = _n
        gen flag = 0
        forvalue i = 1/$N {
            replace flag=1 if (obs==_n+4+_n*28)
        }
        But then the statement inside the loop just does the same thing again and again, and the loop is thus redundant. So the code simplifies to

        Code:
        gen obs = _n
        gen flag = 0
        replace flag=1 if obs==_n+4+_n*28
        which in turn simplifies to

        Code:
        gen flag = 0
        replace flag=1 if  _n == _n+4+_n*28
        and to

        Code:
        gen flag = _n == _n+4+_n*28
        and to

        Code:
        gen flag = _n == 29 * _n+4
        That condition will never be true, so my guess (yet another one) is that your variable ended up as all zeros.
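
        As a quick check of that reasoning, here is a small sketch on 100 made-up observations. It assumes, as guessed above, that obs was created as _n. It counts how often the condition from #1 holds and compares it with the mod() condition from #3.

        ```stata
        clear
        set obs 100
        gen obs = _n
        * condition from #1: obs == 29 * _n + 4 would need 28 * _n == -4, impossible for _n >= 1
        count if obs == _n + 4 + _n*28
        * condition from #3: true at observations 5, 29, 53 and 77
        count if mod(_n - 5, 24) == 0
        ```

        The first count should be 0 and the second 4, matching the listing in #3.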

        Please note the advice in https://www.statalist.org/forums/help#stata on reporting fully your data, your code and the problem(s) you got -- including the implication that "not working" is not informative as a problem report.
