
  • Looping over observations in a variable

    Hello, I'd like to find out what the code is to loop over observations in a variable.

    My variable contains entries that are not ordered (there are missing values in between them); however, I would like to run code that loops through each of these observations.

  • #2
    I'm not entirely sure what you are asking for here. In Stata terminology there are no "observations in a variable." Perhaps you mean that you want to perform some set of commands repeatedly, once for each distinct value of the variable, each time using the observations in the data set that have that value for that variable. This is a very common programming situation. Assuming this is what you want to do:

    There are three general ways. Without more detail it is not possible to advise which of them is applicable or best for your situation.

    1. The -by- command. Many single Stata commands can be run under the -by- prefix. -by varname, sort: command- will cause that command to be repeated separately for the observations associated with each value of the variable varname. If what you need to iterate involves more than one command, then it is usually possible to write a program that incorporates those commands and accepts the -by- prefix as well, though this requires a bit of programming expertise.
    2. A generic way to iterate one or more commands over the levels of a variable is:
    Code:
    levelsof varname, local(values) // missing values of varname are skipped unless the -missing- option is specified
    foreach v of local values {
        command if varname == `v' // use `"`v'"' in place of `v' if varname is a string variable
        // you can have as many commands, of any kind, as you like
    }
    3. The -runby- command, written by Robert Picard and me, available from SSC, serves exactly this purpose. The commands to be iterated must be wrapped in a program that takes no arguments and returns nothing--the program is written so as to apply all the commands to a data set containing only a single level of the variable.
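
    To make approach 1 concrete, here is a minimal sketch. It assumes the auto data set that ships with Stata; the variables foreign, rep78, and price and the new variable mean_price are only illustrative stand-ins for your own. Note that -by- treats missing values of the grouping variable as a group of their own.

```stata
* Illustrative sketch only: the data set and variable names are assumptions
sysuse auto, clear

* repeat one command separately for each value of foreign
by foreign, sort: summarize price

* -bysort- is shorthand for -by ..., sort-; rep78 has missing values,
* and those observations form their own group here
bysort rep78: egen mean_price = mean(price)
```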


    As among these approaches, -by- is the simplest and fastest, but has the narrowest range of application: single commands or programs that are -by-able. (The help file will say whether a program accepts -by- or not.) The second method is applicable to any problem and doesn't require advanced programming skills to use. But if you are working with a large data set it is very, very slow. The third approach is very fast with large data sets, but writing and debugging the iterated program correctly is sometimes tricky.
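
    For approach 3, a rough sketch of what the wrapped program might look like, again assuming the auto data set. The program name one_group and the variable b_mpg are made up for illustration, and the regression is just a placeholder for whatever commands you need to iterate.

```stata
* Illustrative sketch only: program and variable names are hypothetical
* run once to install -runby- from SSC (needs internet access)
ssc install runby

capture program drop one_group
program define one_group
    // when -runby- calls this program, the data in memory contain
    // only the observations for a single level of the by-variable
    regress price mpg
    generate b_mpg = _b[mpg]
end

sysuse auto, clear
runby one_group, by(foreign)
// whatever each call leaves in memory is combined into the resulting data set
```

    Testing the program by hand on a single level first (for example, after -keep if foreign == 1-) is one way to make the debugging less tricky.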
    Last edited by Clyde Schechter; 05 Oct 2023, 15:35.


    • #3
      Thank you very much, the code you provided worked flawlessly :D. Apologies for not providing enough detail; I will be sure to do so next time.
