Loop

Olivia Emma

Join Date: Dec 2022

Posts: 102
#1

06 Apr 2023, 15:35

Hello

I have a panel data of ten 9 countries, I want to run this code for the pair of 4 countries (all possible combinations)

How can I do that, please?

foreach dep_var in dep_var_1 dep_var_2 dep_var_3 dep_var_4 {
xi: areg `dep_var' independent_var_1 independent_var_2, robust absorb(group_id) cluster(group_id)
sum `dep_var' if e(sample) & independent_var_1 == 0
outreg2 using output_file, excel dec(3) addstat(Mean of Dep Variable, r(mean)) keep(independent_var_1 independent_var_2)
}

and name the outreg file as C1_C2_C3_C4

I want to check how the results are changed if we select the different set of countries

Thanks
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#2

06 Apr 2023, 15:52

I have a panel data of ten 9 countries, I want to run this code for the pair of 4 countries (all possible combinations)

I can't tell if you are talking about ten countries or 9. Also, I can't tell if you want to select pairs (that is, two countries at a time) or quadruplets (four countries at a time). But before suggesting how this might be done, are you aware of just how big a problem this is?

If you have 10 countries and you pick all possible quadruplets, there are 210 such quadruplets. And for each of those you are going to do 4 regressions, corresponding to the four different dep_vars. So you will be doing 840 regressions. Do you really want to do that? Do you have a plan for working with that much output?

At the other end, if it is 9 countries to start and you are only doing pairs, then there are 36 such pairs, and with 4 regressions for each pair, you have 144 regressions. Again, do you really have a plan for working with the output of 144 regressions?
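For reference, these counts can be checked in Stata itself with the built-in comb() function, which returns the number of ways to choose k items from n. This is only a quick sketch of the arithmetic above; the multiplier 4 is the number of dependent variables in the original loop.

```stata
* Number of country subsets, and regressions at 4 dependent variables each
display comb(10, 4)        // quadruplets chosen from 10 countries
display comb(10, 4) * 4    // regressions for those quadruplets
display comb(9, 2)         // pairs chosen from 9 countries
display comb(9, 2) * 4     // regressions for those pairs
```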
Olivia Emma

Join Date: Dec 2022

Posts: 102
#3

06 Apr 2023, 16:07

Yes, I want to see how the results change if we change the sample of countries.

Sorry, I have 9 countries.

How can I include the country selection (sets of 4 or 3) in the code, please?
Olivia Emma

Join Date: Dec 2022

Posts: 102
#4

06 Apr 2023, 16:59

Clyde Schechter Sir, can you please give me an idea of how to proceed?
Clyde Schechter

Join Date: Apr 2014

Posts: 30066
#5

06 Apr 2023, 17:34

OK. With 9 countries selecting sets of 4, there are 126 combinations, each producing 4 regressions, a total of 504 regressions.

Code:

* List the distinct countries and count them
levelsof country, local(countries)
local n_countries: word count `countries'

* Nested loops with i1 < i2 < i3 < i4 visit each set of 4 countries exactly once
forvalues i1 = 1/`n_countries' {
    local C1: word `i1' of `countries'
    forvalues i2 = `=`i1'+1'/`n_countries' {
        local C2: word `i2' of `countries'
        forvalues i3 = `=`i2'+1'/`n_countries' {
            local C3: word `i3' of `countries'
            forvalues i4 = `=`i3'+1'/`n_countries' {
                local C4: word `i4' of `countries'
                foreach dv in dep_var_1 dep_var_2 dep_var_3 dep_var_4 {
                    * Regression restricted to the current set of 4 countries
                    areg `dv' independent_var_1 independent_var_2 ///
                        if inlist(country, `"`C1'"', `"`C2'"', `"`C3'"', `"`C4'"'), ///
                        absorb(group_id) cluster(group_id)
                    * Mean of the dependent variable in the estimation sample
                    summ `dv' if e(sample) & independent_var_1 == 0, meanonly
                    * One output file per country set and dependent variable
                    outreg2 using `"`C1'_`C2'_`C3'_`C4'_`dv'"', excel dec(3) ///
                        addstat(Mean of Dep Variable, r(mean)) ///
                        keep(independent_var_1 independent_var_2)
                }
            }
        }
    }
}

I am assuming that the countries are identified in the data by a variable called country. Change the first command accordingly if the variable has some other name. I also assume it is a string variable. (In the future, please show example data using the -dataex- command so that such assumptions are not necessary. If my assumption about country being a string is wrong, the code will not work and we will have both wasted time. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data. Help those who want to help you.)
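A minimal sketch of how both points could be checked before running the loop, assuming the variable names used in this thread (country, group_id, dep_var_1, independent_var_1) exist in your data; the choice of 20 observations for the example is arbitrary.

```stata
* Check whether country is stored as a string (the assumption made above)
capture confirm string variable country
if _rc == 0 {
    display "country is a string variable"
}
else {
    display "country is not a string variable; the quoted values in inlist() would need changing"
}

* Produce a small data example suitable for pasting into a forum post
dataex country group_id dep_var_1 independent_var_1 in 1/20
```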

I have made a few changes to your original code. First, I have eliminated the -xi:- prefix on the -areg- command. -xi- is now almost entirely obsolete, having been replaced by factor variable notation. While there are a few archaic commands that do not allow factor-variable notation, for most of them there are more modern commands that will do the same thing and do support factor-variable notation. Also, as your -areg- command was written, -xi:- didn't do anything because none of the variables was prefixed with i. Second, I have for brevity replaced dep_var by dv. Third, I removed the -robust- option from the -areg- command because the -cluster(group_id)- option implies robust. Fourth, I have added the -meanonly- option to the -summ- command. Since you do not use the standard deviations that -summ- produces, there is no reason to waste the extra time involved in computing them. In a single run of -summ- the extra time is imperceptible, but in a loop that will run 504 times it might be noticeable. Finally, I have modified the -outreg2- command so that it will use the output file name you want, except that I also include the name of the dependent variable in the filename. Had I not done that, the outputs for the first three regressions on each set of 4 countries would have been overwritten and lost.
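To illustrate what -meanonly- changes, here is a small sketch using the auto dataset that ships with Stata (not your data): the mean and the number of observations are still returned in r(), but the standard deviation is not computed.

```stata
sysuse auto, clear

* With meanonly, -summarize- prints nothing and skips the variance calculation
summarize price if foreign == 0, meanonly
display r(N)       // number of observations used
display r(mean)    // the value passed to addstat() in the loop
display r(sd)      // missing, because it was never computed
```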

Bear in mind also that I do not use the -outreg2- command and I'm not really familiar with its workings. So if there was anything wrong with the -outreg2- command as you wrote it, you will still have the same problems with it.
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#6

06 Apr 2023, 17:49

I have a bit of a different question. As Clyde notes, you run 500 regressions. Well, 504 but who's counting. Stata, I guess!

Anyways: why so many? What're all these estimates meant to demonstrate or accomplish? I'm not saying this is bad, I should say. I can think of instances where hundreds or thousands of estimates are sensible. But... why here? How will you present all of these, when the time comes to do so? How will you even make sense of all the results you do get? Graphical methods? What's the underlying story these estimates should tell?
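If the goal is to compare estimates across country sets, one possibility (only a sketch, not what the code in #5 does) is to collect the coefficients in a single dataset with -postfile- rather than reading hundreds of separate output files. It reuses the nested loops from #5 and assumes the same variable names and a string country variable; the file name combo_results and the variable names combo, depvar, b and se are made up for illustration.

```stata
* Open a results file: one row per country set and dependent variable
tempname results
postfile `results' str80 combo str32 depvar double b double se ///
    using combo_results, replace

levelsof country, local(countries)
local n_countries: word count `countries'
forvalues i1 = 1/`n_countries' {
    local C1: word `i1' of `countries'
    forvalues i2 = `=`i1'+1'/`n_countries' {
        local C2: word `i2' of `countries'
        forvalues i3 = `=`i2'+1'/`n_countries' {
            local C3: word `i3' of `countries'
            forvalues i4 = `=`i3'+1'/`n_countries' {
                local C4: word `i4' of `countries'
                foreach dv in dep_var_1 dep_var_2 dep_var_3 dep_var_4 {
                    areg `dv' independent_var_1 independent_var_2 ///
                        if inlist(country, `"`C1'"', `"`C2'"', `"`C3'"', `"`C4'"'), ///
                        absorb(group_id) cluster(group_id)
                    * Store the coefficient and standard error of independent_var_1
                    post `results' (`"`C1'_`C2'_`C3'_`C4'"') ("`dv'") ///
                        (_b[independent_var_1]) (_se[independent_var_1])
                }
            }
        }
    }
}
postclose `results'

* Look at how the coefficient varies across country sets
use combo_results, clear
histogram b, by(depvar)
```

With everything in one dataset, the spread of the coefficient across country sets can be graphed or tabulated directly.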