  • Loop within a loop?

    Hi,

    I'm aware this may be quite trivial, so apologies in advance. I am running the commands below and would like them to be more efficient. I'm familiar with setting a global macro of all the sports and then running a loop, but this creates variables A-Z for each sport, which I do not want. I only want A to correspond to Football, B to Cricket, and so on.

    gen A=0
    replace A=1 if sport=="Football"

    gen B=0
    replace B=1 if sport=="Cricket"

    gen C=0
    replace C=1 if sport=="Rugby"

    ....

    gen Z=0
    replace Z=1 if sport=="Snooker"


    Any assistance would be much appreciated! Many thanks in advance.

  • #2
    Well, I'd use better variable names. But, setting that aside, this should work for your example. There are other approaches too. The big deal is that this problem is not at all a loop within a loop as usually understood in programming. It's one loop over two or more lists in parallel.

    Code:
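    // Parallel lists: the variable names in foreach, the matching values in local sports
    local sports Football Cricket Rugby

    foreach v in A B C {
         // gettoken moves the next value into local sport and leaves the rest in local sports
         gettoken sport sports : sports
         gen `v' = sport == "`sport'"
    }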
    Note that the generate/replace approach is two lines where one will do. I don't know who or what is recommending this long-winded approach when better ones are documented. See e.g. https://www.stata.com/support/faqs/d...rue-and-false/

    The only qualification is that you may want indicators that are 0, 1 or missing in which case

    Code:
    gen `v' = sport == "`sport'"  if !missing(sport)
    keeps the core code down to one line.

    The most obvious complication is that the elements may not be single words. For example, my sport is "Stata programming". I will let you pose a more realistic example if this doesn't answer your question.
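
    A minimal sketch of how the same parallel-list loop might cope with multi-word values, assuming a string variable called a_soc; the variable name, the two values and the new names A_IND and B_IND are illustrative only. Each value is wrapped in double quotes and the whole list in compound double quotes:

    Code:
    // Assumption: a_soc is a string variable; names and values here are illustrative
    local values `" "Problems with Intestine" "Psychiatric disorders" "'

    foreach v in A_IND B_IND {
         // gettoken takes the next quoted value off the list and strips its outer quotes
         gettoken value values : values
         gen `v' = a_soc == "`value'" if !missing(a_soc)
    }
    The inner quotes are what keep each multi-word value together as a single token; without them gettoken would split the list on spaces.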


    • #3
      In the interest of expanding Stata techniques available to you, here's an alternative:
      Code:
      clear
      input str20 sport
      "Football"
      "Cricket"
      "Rugby"
      "Snooker"
      end
      // The generate option on tab1 creates a set of sequentially numbered binary variables.
      tab1 sport, generate(sportB)
      describe
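      // If you don't like names like sportB1, sportB2, you can rename them.
      // The indicators are numbered in sorted order of the values (Cricket, Football, Rugby, Snooker),
      // so the new names are listed in that order to get A = Football, B = Cricket, C = Rugby, D = Snooker.
      rename (sportB*) (B A C D)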


      • #4
        Hi Nick Cox ,

        Thanks for your response. This is not a copy of my actual code; I had to simplify it considerably for confidentiality reasons. Unfortunately, the code above does not seem to work and produces an error when running the loop:

        sport not found
        r(111);

        end of do-file

        r(111);

        In fact, it would be very helpful to know an approach that works for values made up of more than one word, as is the case in my actual dataset.

        E.g.

        gen A_IND=0
        replace A_IND=1 if a_soc=="Problems with Intestine"

        gen B_IND=0
        replace B_IND=1 if a_soc=="Psychiatric disorders"

        ...
        Last edited by Kay Vee; 17 Jul 2018, 08:11.


        • #5
          The first part of your question is easy to answer. #1 implied and #2 inferred that you have a variable called sport. If you don't then the code will fail. Otherwise put, you need to change the variable name to one you have.

          The second part of your question ignores my advice in #2, expanded in the reference I gave. Regardless of that, if you have a long list of complicated names, the approach in #1 and #2 should probably be abandoned rather than modified.

          Code:
          tab a_soc, gen(a_soc)
          will create a bundle of indicators, as Mike has already explained. It may be that you don't need any of them if you can use factor variable notation.
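
          A hedged sketch of the factor-variable route, assuming a hypothetical numeric outcome y and an illustrative new variable name a_soc_n (neither comes from this thread). Factor-variable notation needs a numeric variable, so the string variable is encoded first:

          Code:
          // Assumptions: y is a hypothetical outcome; a_soc_n is an illustrative name
          encode a_soc, generate(a_soc_n)
          // i.a_soc_n enters one indicator per category (minus a base level) without creating new variables
          regress y i.a_soc_n
          With this approach the model command builds the indicators on the fly, so no A_IND, B_IND, ... variables need to exist in the dataset.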
