  • String variable values as (part of the) names of scalars

    Dear Stata users,

    I would really appreciate some help on the following issue. Before that, my apologies if it has already been solved somewhere else, and I did not notice it. Suppose that we have a panel data set which contains three variables.
    1. A numerical variable: "ccode" representing the numerical code of each country in the data set [1, 2, 3,....].
    2. A string variable: "cname" representing the name of each country in the data set [Albania, Afghanistan, Algeria,...].
    3. A numerical variable: "y" representing...
    • The purpose is to generate a collection of scalars: "y_mean_Albania", "y_mean_Afghanistan", "y_mean_Algeria",.... containing the sample mean of variable "y" for each country in the data set.
    • I have not been able to do it. I have figured out, however, how to generate this collection of scalars: "y_mean_1", "y_mean_2", "y_mean_3",.... But this requires checking the lists of ccode's and cname's to see which country a given scalar, say "y_mean_i", belongs to [some of my code is included]

    Code:
    scalar year_T = 2011;
    scalar year_F     = 1990;
    scalar year_L       = 2016;
    
    quiet use MyData.dta, clear;
    
    levelsof ccode, local(levels);
    foreach i of local levels {;
    
    quiet drop if year < year_T - 1;
    quiet drop if year > year_F;
    quiet drop if ccode != `i';
    display `i' " " cname;
    
    quiet sum y, detail;
    
    scalar y_mean_1_`i' = r(mean);
    scalar y_mean_2_`cname' =  r(mean);
    
    quiet use MyData.dta, clear;
    };
    
    
    /* This works */
    foreach i of local levels {;
    scalar list y_mean_1_`i';
    };
    
    /* This does not work */
    
    scalar list  y_mean_2_Albania;

    Thanks in advance,

    Cruz A. Echevarria

  • #2
    This line
    Code:
    display `i' " " cname;
    displays cname[1] - the value of the variable cname in the first observation of your data at the time it is run. But this command
    Code:
    scalar y_mean_2_`cname' = r(mean);
    requires a local macro named cname, which you have not defined. The solution is to replace the first command I quoted with
    Code:
    local cname = cname[1];
    display "`i'  `cname'";
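
    For reference, a minimal sketch of the whole loop with that fix applied. It is an illustration under assumptions, not the original poster's code: the toy data are made up to stand in for MyData.dta, it uses the default line delimiter rather than ";", it reads the name with levelsof rather than cname[1] so that nothing has to be dropped or reloaded, and it passes the name through strtoname() because a country name containing spaces or punctuation is not a legal scalar name.

    ```stata
    * Toy panel standing in for MyData.dta (values are made up)
    clear
    input byte ccode str20 cname int year double y
    1 "Albania"     2010 1
    1 "Albania"     2011 3
    2 "Afghanistan" 2010 2
    2 "Afghanistan" 2011 6
    3 "Algeria"     2010 5
    3 "Algeria"     2011 7
    end

    levelsof ccode, local(levels)
    foreach i of local levels {
        * Put this country's name into a local macro (assumes one name per ccode)
        quietly levelsof cname if ccode == `i', local(name) clean
        * Turn the name into a legal Stata name, e.g. spaces become underscores
        local sname = strtoname("`name'")

        * Mean of y for this country only; run last so r(mean) is not overwritten
        quietly summarize y if ccode == `i', meanonly
        scalar y_mean_1_`i' = r(mean)
        scalar y_mean_2_`sname' = r(mean)

        display "`i' `name': " scalar(y_mean_2_`sname')
    }

    scalar list y_mean_2_Albania
    ```

    Stata names are limited to 32 characters, so a long country name would still need shortening once the y_mean_2_ prefix is added.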


    • #3
      This code seems very roundabout. You can do the important stuff with just two commands.

      As you want the same years in every case, at most you need to drop or keep just once. You don't need to keep reading in the same dataset. In fact, you don't need to do anything except work with one dataset.

      You want to work only on years from 2011 or until 1990? Is that right?

      I can't see any advantage in accumulating lots of scalars. If you want to use the results elsewhere, use e.g. egen to generate a variable or collapse to produce a new dataset. Otherwise you are just creating the problem of writing code to do anything useful with said scalars.
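
      A minimal sketch of those two alternatives, assuming made-up toy data in place of MyData.dta. The year window and the new variable names y_cmean and y_mean are illustrative choices only; the intended window is not settled in this thread.

      ```stata
      * Toy panel standing in for MyData.dta (values are made up)
      clear
      input byte ccode str20 cname int year double y
      1 "Albania"     1985 9
      1 "Albania"     2010 1
      1 "Albania"     2011 3
      2 "Afghanistan" 2010 2
      2 "Afghanistan" 2011 6
      end

      * Restrict the years once, up front (illustrative window only)
      keep if inrange(year, 1990, 2011)

      * Option 1: keep the panel and attach each country's mean as a variable
      egen y_cmean = mean(y), by(ccode)

      * Option 2: reduce the data to one observation per country
      collapse (mean) y_mean = y, by(ccode cname)
      list, noobs
      ```

      Either way the means stay attached to ccode and cname, so no lookup between codes and names is needed afterwards.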

      summarize, detail is a poor choice if you only want the mean. That's why summarize has a meanonly option.

      What's wrong with this?

      Code:
      use MyData.dta, clear
      tabstat y if year < 2010 | year > 1990 , s(mean) by(ccode)
      Last edited by Nick Cox; 12 Feb 2018, 06:45.


      • #4
        Thanks Nick. The word is "efficiency".
        Regards,
        Cruz


        • #5
          Which code is more efficient then? Not clear on your meaning there.
