  • How to display org name once if <foreach> with multiple observations?

    Hi Statalist.
    I'm stumped on what I feel should be a simple problem. I'm running Stata 18.
    I have a dataset with responses to a survey (60 items) from thousands of people from 1,000 organizations (each with an "ID") over the course of three years. Some organizations log only one response, while others log many. I have identified organizations logging over 50 responses, and want to look at their response pattern over years using a simple <tab>. The output I want is to display in red the name of the organization followed by a one-way tab by year.

    Let's say orgs 234 498 and 1256 are my suspects (although I have many more). The code that isn't working for me is...

    local tagorgs 234 498 1256

    foreach num of numlist `tagorgs' {
    display in red org_name if ID==`num'
    tab year if ID==`num'
    }

    While this does yield <tab by year> for the three orgs (with IDs 234, 498 and 1256), it only displays a single org name that doesn't match with any of those three organizations.

    I'm also concerned that I'm not correctly using a numlist.

    Thanks in advance for any help - both with code, as well as any insight on why this isn't working.

    I did find a workaround (below) but the output is not as tidy, and I have to review this with a client who's not very quant-friendly.
    foreach num of numlist `over50' {
    table org_name year if ID==`num', statistic(freq)
    }
    *

  • #2
    -display- does not perform calculations on variables. Whenever a variable name occurs in -display-, it has to pick a single value to represent that variable. By default, it picks the value appearing in the first observation in the data set. You can override that default by specifying a subscript if you have some other particular observation in mind. But even that doesn't describe your situation. You want it to display the value of org_name that is associated with ID == `num', which presumably occurs in many, but not all, observations in the data set. So you have to first create a local macro containing that value.
    Code:
    foreach num of numlist `tagorgs' {
        levelsof org_name if ID == `num', local(to_show)
        display in red `to_show' if ID==`num'
        tab year if ID==`num'
    }
    By the way, have you considered just doing
    Code:
    by org_name, sort: tab year
    This won't give you the name in red, but each table will be preceded by a header line giving the value of org_name. (N.B. I'm assuming here that there is a one-to-one correspondence between values of org_name and values of ID here.) And if your data set is very large, it will run much faster. (Then again, if your data set is large enough that you will notice the speed difference, the amount of output will be far too voluminous to review with anybody, quant-friendly or not.)
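
    The point about -display- picking a single observation can be checked on a shipped data set. The lines below are a minimal sketch, not part of the original reply: they assume Stata's bundled auto data and use make as a stand-in for org_name.

    ```stata
    * Sketch only: auto.dta and make stand in for the poster's data and org_name
    sysuse auto, clear

    * With a bare variable name, -display- uses the value in observation 1
    display make

    * An explicit subscript picks out one particular observation instead
    display make[42]

    * -display- takes no -if- qualifier: a trailing "if ..." is read as one
    * more thing to display, not as a condition on observations
    ```

    Neither call looks across observations, which is why the value has to be put into a local macro first.
    
    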

    • #3
      Hi Clyde Schechter ~

      Thanks for digging in. I tried both solutions, with an error for the first and the second being not helpful because I have ~20k observations across ~2,300 sites. If you have an additional thought/tweak that would make this run, I would be very appreciative! (And either way, I appreciate your note above.)

      So...
      Code:
       
       by org_name, sort: tab year
      Was not a productive solution...too much scrolling. Hence, the "tagorgs" attempt.


      So I tried what you suggested - I'm hoping there's a small tweak that will resolve the error. Let's include just two orgs, with IDs 411 and 387...
      Code:
      local tagorgs 411 387
      foreach num of numlist `tagorgs' {
          levelsof org_name if ID == `num', local(to_show)
          display in red `to_show' if ID==`num'
          tab year if ID==`num'
      }
      Stata returns the name of the first school (I had been referring to these as orgs, but same thing) both in black with quotes (I assume that's the <levelsof> line) and then the <display in red>, but followed by "if not found" in red, then crashes.

      [Attached image: stata_error.png]

      It may be worth noting that you are correct in your assumptions: (1) there is a 1:1 correspondence between values of org_name and values of ID, and (2) the value of org_name occurs in all observations in my data set.

      Lastly, I looked into the r(111) error, which is "variable not defined". If I run...
      Code:
      local tagorgs 411 387
      foreach num of numlist `tagorgs' {
         tab year if ID==`num'
      }
      ...then I get just what I expect. So it looks like it's crashing in the "di in red" line.

      Any ideas on the "if not found" hitch? Again, either way, thank you for your thought earlier!


      • #4
        Code:
        local tagorgs 411 387
        foreach num of numlist `tagorgs' {
            levelsof org_name if ID == `num', local(to_show)
            display in red `"`to_show'"'
            tab year if ID==`num'
        }
        Sorry, I should have foreseen that error. If you wrap `to_show' in compound double quotes, as in the -display- line above (highlighted in red in the original post), and remove the -if- qualifier from the -display- command, it will run without an error.
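
        For anyone wanting to try the pattern without the original data, here is a hedged, self-contained variant on the auto data set. The names id, tagids, make and foreign are stand-ins chosen for illustration. Two details differ from the code above and are optional: -quietly- suppresses the list that -levelsof- itself prints, and the clean option stores the string value without the compound double quotes that -levelsof- otherwise wraps around it.

        ```stata
        * Sketch only: auto.dta stands in for the survey data, id for ID,
        * make for org_name, and foreign for year
        sysuse auto, clear
        gen id = _n

        local tagids 42 7
        foreach num of numlist `tagids' {
            * store the single matching name in a local macro, without echoing it
            quietly levelsof make if id == `num', local(to_show) clean
            * compound double quotes keep names containing quotes intact;
            * note there is no -if- qualifier on -display-
            display as error `"`to_show'"'
            tab foreign if id == `num'
        }
        ```

        Here -display as error- is the current spelling of red output; -display in red- is the older syntax used in the thread. Writing `foreach num of local tagids` would iterate over the same list of IDs.
        
        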


        • #5
          Nailed it! Thank you so much Clyde Schechter!!!


          • #6
            Some related technique. groups is from the Stata Journal.

            Code:
            . sysuse auto, clear
            (1978 automobile data)
            
            . gen id = _n
            
            . groups make if id == 42
            
              +----------------------------------------+
              | make          Freq.   Percent      %<= |
              |----------------------------------------|
              | Plym. Arrow       1    100.00   100.00 |
              +----------------------------------------+
            
            . groups make if id == 42, show(none)
            
              +-------------+
              | make        |
              |-------------|
              | Plym. Arrow |
              +-------------+
            
            . tabdisp make if id == 42, c(id)
            
            ------------------------
            Make and    |
            model       |         id
            ------------+-----------
            Plym. Arrow |         42
            ------------------------


            • #7
              As always...
              Thank you Nick! I really appreciate your contributions across the forum.
