  • setting graph title equal to variable value in a forvalues loop

    Hi,

    I am trying to create graphs for 71 countries using a forvalues loop. I want the title of each graph to be the name of the country that it represents. I have tried using a local macro to hold the value of the country_name variable within a given group, but it hasn't worked. Here is my code:
    Code:
    #delimit ;
    forvalues i=1/71 {;
    local t= country_name[`i'];
        graph twoway (scatter nkill v2x_libdem if cgroup==`i')
                     (lfit nkill v2x_libdem if cgroup==`i'),
                title("`t'")
                 ...
        graph save lin`i', replace;
    };
    #delimit cr
    The "local t=..." line here would work if I had one observation per country, but I have varying numbers of observations per country.

    I think there is a way to do it using labels, but I haven't found a way to define a label for the variable group without going through one by one.

    Thanks in advance for the help.

    Best,

    Julian

  • #2
    One way to do it:

    Code:
    gen long obs = _n
    
    forvalues i = 1/71 {
        // which observations?
        su obs if cgroup == `i', meanonly
        local t = country_name[r(min)]
    
        scatter nkill v2x_libdem if cgroup==`i' ///
            || lfit nkill v2x_libdem if cgroup==`i', title("`t'")
    
        graph save lin`i', replace
    }
    
    drop obs
    See http://www.stata-journal.com/sjpdf.h...iclenum=dm0025



    • #3
      Thanks very much, Nick! The link was especially helpful.

      I ended up using:

      Code:
      gen long obs= _n
      egen obso=mean(obs), by(country_name)
      gen robso=round(obso, 1)
      
      #delimit ;
      forvalues i=1/71 {;
      levelsof robso if cgroup==`i', local(y);
      local t= country_name[`y'];
          graph twoway (scatter nkill v2x_libdem if cgroup==`i', msymbol(oh) mcolor(gs9))
                       (lfit nkill v2x_libdem if cgroup==`i', range(0 1) lwidth(thin) lcolor(gold)),
                  title("`t'")
                  ...
                  graph save lin`i', replace;
      };
      #delimit cr
      Best,

      Julian



      • #4
        So, why did you make it more complicated? You replaced 3 lines with 5. I am genuinely curious about why you think that code is better.



        • #5
          It's not that I think it's better; it's just what I understand more easily given the commands I am familiar with. I am still learning how to use Stata and am not insulting your code.

          cheers,

          julian



          • #6
            Julian: Equally I am just expressing puzzlement, no more, no less.

            Let's work backward from your code towards mine to explain the small differences. Although the graphs are the obvious goal, the technique you needed was how to "look up" the country when you are cycling over different values of an associated numeric variable.

            In fact there is a better way. cgroup is evidently a numeric variable in one-to-one correspondence with a string variable of country names. If you make the values of country_name the value labels of cgroup, then you can just look up the value label for each value of cgroup as you loop.

            Here is a silly example all can check.

            Code:
            clear
            input mynum str8 myname
            1   "Rose"
            2   "Joanne"
            3   "Hermione"
            4   "Ginny"
            end
            
            labmask mynum, values(myname)
            
            forval j = 1/4 {
                di "The label for value `j' is `: label (mynum) `j''"
            }
            By construction we have a numeric variable in one-to-one correspondence with a string variable. labmask (Stata Journal) loops over the combinations and maps the values to value labels. Then as you loop over the distinct numeric values you can ask Stata to access the corresponding value label.

            Code:
            search labmask
            
            help extended fcn
            The first points to download locations for that command; the second explains the lookup syntax.

            Even simpler would be to encode the country names and then use the corresponding numeric variable.
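
            A minimal sketch of the encode route, reusing the silly names from above (mynum and myname are illustration names, not the poster's variables). It assumes nothing beyond official Stata: encode numbers the distinct strings 1 up in alphabetical order and attaches the strings as value labels, so the same lookup syntax applies.

            ```stata
            clear
            input str8 myname
            "Rose"
            "Rose"
            "Joanne"
            "Hermione"
            "Hermione"
            "Ginny"
            end

            * encode creates a numeric variable whose value labels are the strings
            encode myname, gen(mynum)

            * loop over the distinct numeric values and look up each label
            su mynum, meanonly
            forval j = 1/`r(max)' {
                di "The label for value `j' is `: label (mynum) `j''"
            }
            ```

            Note that the numbering follows alphabetical order of the names, so it need not match an existing group variable such as cgroup.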

            That aside, let me return to your code and mine. Simplifying and applying some purely cosmetic edits, the nub of your code is

            Code:
            gen long obs = _n
            egen obso = mean(obs), by(country_name)
            gen robso = round(obso, 1)
            
            forvalues i = 1/71 {
                levelsof robso if cgroup == `i', local(y)
                local t = country_name[`y']
                // graph commands go here
            }
            We (you and I) start with the same idea.

            Code:
            gen long obs = _n
            putting observation numbers into a variable so we can summarize. The storage or variable type long here could be needed for very big datasets where the observation numbers could be very large.

            It's sufficient to find just one observation # for each group; then we can look up the country name. You have

            Code:
            egen obso = mean(obs), by(country_name)
            gen robso = round(obso, 1)
            in which you are averaging observation numbers and then noticing that the answer can be fractional. Hence the second line, to correct the problem of the first. But there is a further problem: suppose a group is represented in observations 16 and 18; the mean is 17 and that observation contains a different value. Very likely your data are sorted so that this will not bite, but in any event the problem is avoided -- and the rounding problem is avoided too -- by using the minimum or the maximum observation number. So those two lines can be replaced by (say)

            Code:
            egen obso = min(obs), by(country_name)
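
            To make the mean-versus-minimum point concrete, here is a small sketch on made-up data in which one group is not stored contiguously. The variable names (myname, obsmean, obsmin) are just for illustration.

            ```stata
            clear
            input str1 myname
            "A"
            "B"
            "A"
            end

            gen long obs = _n
            egen obsmean = mean(obs), by(myname)
            egen obsmin  = min(obs), by(myname)

            * group A sits in observations 1 and 3, so its mean observation number is 2
            local m = obsmean[1]
            local k = obsmin[1]

            * the mean lands on an observation of B; the minimum stays inside A
            di "mean points at " myname[`m'] ", min points at " myname[`k']
            ```
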
            So you have pertinent observation numbers in a variable. But if you put them in a new variable, you need to get them out again.

            Code:
            levelsof robso if cgroup == `i', local(y)
            local t = country_name[`y']
            Using levelsof here (a command I have nothing against) will only work if there is a single level to be retrieved. In your case, there is indeed only one, but for more general technique you need a command guaranteed to give a single answer. Otherwise the local macro would contain more than one level and could not be used as a subscript.
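
            A short sketch of that difference on made-up data (g is a hypothetical group variable): levelsof hands back every distinct value, whereas summarize always leaves a single number in r(min).

            ```stata
            clear
            input g
            1
            1
            2
            end

            gen long obs = _n

            * two observations in group 1, so two levels come back
            levelsof obs if g == 1, local(y)
            di "levels returned: `y'"

            * summarize yields exactly one minimum, whatever the group size
            su obs if g == 1, meanonly
            di "single subscript: " r(min)
            ```

            With two levels in the macro, something like country_name[`y'] would no longer be a valid subscript.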

            In the code I used I did not see any need to create extra variables such as obso or robso because obs already contains the information needed.

            Inside the loop I summarized obs directly

            Code:
            su obs if cgroup == `i', meanonly
            local t = country_name[r(min)]
            This needs comment because the meanonly option name is misleading. Even under that option some other results are calculated and stored, including the minimum. For more on that, see e.g. http://www.stata-journal.com/sjpdf.h...iclenum=st0135

            The value of the minimum is accessible after summarize in r(min), which we plug in directly to get the country name we want. In fact even the local macro can be dispensed with as after summarize we could put in the graph call

            Code:
              
             title("`= country_name[r(min)]'")
            although that code is a little intimidating to many beginners.
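
            For anyone who wants to try the inline form, here is a sketch on the auto dataset shipped with Stata. The variables origin and cgroup are stand-ins I construct for country_name and the numeric group variable; they are not from the original data.

            ```stata
            sysuse auto, clear

            decode foreign, gen(origin)   // string stand-in for country_name
            gen cgroup = foreign + 1      // numeric stand-in taking values 1 and 2
            gen long obs = _n

            forvalues i = 1/2 {
                // leaves the first observation number of this group in r(min)
                su obs if cgroup == `i', meanonly

                // the title expression is evaluated when the graph command is parsed
                scatter mpg weight if cgroup == `i' ///
                    || lfit mpg weight if cgroup == `i', title("`= origin[r(min)]'")
            }
            ```

            The summarize call must come immediately before the graph command, since any other r-class command in between would overwrite r(min).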

            I hope that makes some details clearer.
            Last edited by Nick Cox; 04 Aug 2016, 17:51.
