String variable values as (part of the) names of scalars

Cruz Echevarria

Join Date: Feb 2017

Posts: 11
#1

12 Feb 2018, 04:26

Dear Stata users,

I would really appreciate some help on the following issue. Before that, my apologies if it has already been solved somewhere else, and I did not notice it. Suppose that we have a panel data set which contains three variables.
A numerical variable: "ccode" representing the numerical code of each country in the data set [1, 2, 3,....].

A string variable: "cname" representing the name of each country in the data set [Albania, Afghanistan, Algeria,...].

A numerical variable: "y" representing...

The purpose is to generate a collection of scalars: "y_mean_Albania", "y_mean_Afghanistan", "y_mean_Algeria",.... containing the sample mean of variable "y" for each country in the data set.

I have not been able to do it. I have, however, figured out how to generate this collection of scalars: "y_mean_1", "y_mean_2", "y_mean_3",.... But this requires checking the lists of ccode and cname values to see which country each scalar, say "y_mean_i", belongs to [some of my code is included].

Code:

scalar year_T = 2011;
scalar year_F = 1990;
scalar year_L = 2016;
quiet use MyData.dta, clear;
levelsof ccode, local(levels);
foreach i of local levels {;
    quiet drop if year < year_T - 1;
    quiet drop if year > year_F;
    quiet drop if ccode != `i';
    display `i' " " cname;
    quiet sum y, detail;
    scalar y_mean_1_`i' = r(mean);
    scalar y_mean_2_`cname' = r(mean);
    quiet use MyData.dta, clear;
};
/* This works */
foreach i of local levels {;
    scalar list y_mean_1_`i';
};
/* This does not work */
scalar list y_mean_2_Albania;

Thanks in advance,

Cruz A. Echevarria
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

12 Feb 2018, 06:13

This line

Code:

display `i' " " cname;

displays cname[1] - the value of the variable cname in the first observation of your data at the time it is run. But this command

Code:

scalar y_mean_2_`cname' = r(mean);

requires a local macro named cname, which you have not defined. The solution is to replace the first command I quoted with

Code:

local cname = cname[1];
display "`i' `cname'";
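For reference, here is a minimal sketch of the whole loop with the name held in a local macro. It is not the original poster's code. It assumes that each ccode corresponds to exactly one cname, it omits the year restriction because the intended range is unclear, and it passes each name through strtoname() so that names containing spaces (a hypothetical "Costa Rica", say) still yield legal scalar names. Truncating to 23 characters keeps the full scalar name within Stata's 32-character limit.

```stata
* Sketch under stated assumptions: MyData.dta holds ccode, cname and y,
* and every ccode maps to exactly one cname.
use MyData.dta, clear
levelsof ccode, local(levels)

foreach i of local levels {
    * Put the country name for this code into a local macro
    quietly levelsof cname if ccode == `i', local(cname) clean

    * Make it a legal name; 9 characters of prefix + 23 = 32 at most
    local cname = substr(strtoname(`"`cname'"'), 1, 23)

    * summarize must come after levelsof, which overwrites r()
    quietly summarize y if ccode == `i', meanonly
    scalar y_mean_1_`i' = r(mean)
    scalar y_mean_2_`cname' = r(mean)
}

* Works provided a country named Albania is in the data
scalar list y_mean_2_Albania
```

Using an if qualifier instead of drop means the dataset never has to be reloaded inside the loop.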
Nick Cox

Join Date: Mar 2014

Posts: 35610
#3

12 Feb 2018, 06:42

This code seems very roundabout. You can do the important stuff with just two commands.

As you want the same years in every case, you need to drop or keep at most once. You don't need to keep reading in the same dataset. In fact, you don't need to do anything except work with one dataset.

You want to work only on years from 2011 or until 1990? Is that right?

I can't see any advantage in accumulating lots of scalars. If you want to use the results elsewhere, use e.g. egen to generate a variable or collapse to produce a new dataset. Otherwise you are just creating the problem of writing code to do anything useful with said scalars.

summarize, detail is a poor choice if you only want the mean. That's why summarize has a meanonly option.

What's wrong with this?

Code:

use MyData.dta, clear
tabstat y if year < 2010 | year > 1990 , s(mean) by(ccode)
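The egen and collapse alternatives mentioned above could look roughly like the following. This is only a sketch and not code from the original post: the new variable names y_cmean and y_mean are made up for illustration, and no year restriction is applied, since the intended range is still an open question. An if qualifier can be added to either command once that is settled.

```stata
* Sketch: assumes MyData.dta holds ccode, cname and y
use MyData.dta, clear

* Alternative 1: keep the data as they are and add the country mean
* as a new variable (y_cmean is an illustrative name)
egen y_cmean = mean(y), by(ccode)

* Alternative 2: reduce the data to one observation per country,
* holding the mean of y (y_mean is an illustrative name)
collapse (mean) y_mean = y, by(ccode cname)
list, noobs
```

Either way the means stay attached to the country identifiers, so nothing has to be looked up by scalar name afterwards.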

Last edited by Nick Cox; 12 Feb 2018, 06:45.
Cruz Echevarria

Join Date: Feb 2017

Posts: 11
#4

13 Feb 2018, 06:03

Thanks Nick. The word is "efficiency".
Regards,
Cruz
Nick Cox

Join Date: Mar 2014

Posts: 35610
#5

13 Feb 2018, 06:24

Which code is more efficient then? Not clear on your meaning there.
