Ordering Multiples variables

Mario Ferri

Join Date: Jul 2019
Posts: 190

Ordering Multiples variables

26 Jul 2022, 15:12

I am having a dataset in wide format for almost 80 years for 100 countries for about 200 variables. Variables bring the 3-letter ISO country code as their first component. I would need to reorder the variables in two different ways, with probably two loops.

The first is to order everything, grouping by each variable separately. In the example, below, that will be like:

Code:

 
  ts FRA_u DEU_u GBR_u USA_u FRA_size  DEU_size  GBR_size USA_size FRA_Δprdm DEU_Δprdm GBR_Δprdm USA_Δprdm

And the second to order everything by country code, similar to :

Code:

ts FRA_u FRA_size FRA_u FRA_Δprdm DEU_u  DEU_size DEU_Δprdm GBR_u GBR_size GBR_Δprdm  USA_u USA_size USA_Δprdm

I will need now two different reordering loops in order to produce my final results.
For more than 100 countries, a reordering for each one is time-consuming

The following is an example sample created just for this purpose. Variables do follow a random order, and as said above, there are more than 200 for more than 100 countries

I have been greatly benefited from this forum every time I posted a question and cannot thank enough!

Best,
Mario

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input float(ts DEU_u DEU_size DEU_Δprdm FRA_u FRA_size FRA_Δprdm GBR_u GBR_size GBR_Δprdm USA_u USA_size USA_Δprdm)
2000  7.92 19.044518         0 10.22  22.32587  .08333331 5.56 16.809107        0 3.99  14.02601         0
2001  7.77 18.948927         0  8.61 22.141853   .1547619  4.7 17.468409        0 4.73 14.528614         0
2002  8.48  19.23157 .08333331   8.7  22.72115          0 5.04 18.273579        0 5.78 15.041862         0
2003  9.78  19.32564         0  8.31 23.143427          0 4.81  18.95393 .0714286 5.99 15.246542         0
2004 10.73  18.81738         0  8.91 23.044697          0 4.59 19.572525        0 5.53  15.16888         0
2005 11.17  18.77591 .08333337  8.49  23.06704  .23809522 4.75 19.789913        0 5.08 15.055298 .08333331
2006 10.25  18.32777         0  8.45  22.75918  .08333337 5.35 19.919577        0 4.62  15.00832         0
2007  8.66 17.855055 .08333331  7.66  22.43314 -.14285713 5.26  19.74264        0 4.62 15.214797         0
2008  7.52  18.25556         0  7.06 22.562563          0 5.62 20.502636        0 5.78  15.98871         0
2009  7.74   19.9917         0  8.74 24.080805  .14285713 7.54   22.0684        0 9.25 16.822023         0
2010  6.97  19.56321         0  8.87 23.989256          0 7.79 21.641914        0 9.63 16.743158         0
end

Tags: None

Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#2

26 Jul 2022, 15:23

If I understand you correctly, you are working with a data set that has 100 x 200 = 20,000 variables and are looking to reorder it in two different ways. I honestly cannot imagine what can usefully be done with such a data set, in any order, nor why you would want to work with it in this wide layout.

My approach to a data set like this would be to instantly -reshape- it into long layout (i.e. standard panel data) and do all further data management and analysis from there.

Code:

ds DEU_* local stubs `r(varlist)' local stubs: subinstr local stubs "DEU_" "@", all reshape long `stubs', i(ts) j(country) string encode country, gen(ncountry) xtset ncountry ts
1 like
Comment
William Lisowski

Join Date: Dec 2014

Posts: 10150
#3

26 Jul 2022, 15:43

I agree with Clyde's advice.

The experienced users here generally agree that, with few exceptions, Stata makes it much more straightforward to accomplish complex analyses using a long layout of your data rather than a wide layout of the same data. You should try to achieve what you need with the data organized as it was before you reshaped it to a wide layout at

https://www.statalist.org/forums/for...ables-renaming

and seek the help of Statalist if there are tasks you cannot figure out how to easily accomplish with a long layout. The sort of problems you will encounter trying to use your wide data will almost certainly be solved by reshaping the data back to long.
Comment
Mario Ferri

Join Date: Jul 2019

Posts: 190
#4

26 Jul 2022, 15:57

Originally posted by Clyde Schechter View Post

If I understand you correctly, you are working with a data set that has 100 x 200 = 20,000 variables and are looking to reorder it in two different ways. I honestly cannot imagine what can usefully be done with such a data set, in any order, nor why you would want to work with it in this wide layout.

My approach to a data set like this would be to instantly -reshape- it into long layout (i.e. standard panel data) and do all further data management and analysis from there.

Code:

ds DEU_* local stubs `r(varlist)' local stubs: subinstr local stubs "DEU_" "@", all reshape long `stubs', i(ts) j(country) string encode country, gen(ncountry) xtset ncountry ts

Thanks for the suggestion. My motivation is of two type, why I want to work with that dataset

First, it is part of the task to produce a dataset of the general.

Second, I am having a code in matlab for a procedure which stata is not able to perform, yet Format needs to be in wide as in #1.

The variables are within the stata matrix size, in the dataset, and already in wide stata format. I just need to do a reordering as in #1.

So, to keep it short, no need to produce new variables, reshape in long or other. Just a simple reordering loop.

Thank you very much for your time and help

Mario

Last edited by Mario Ferri; 26 Jul 2022, 16:09.
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#5

26 Jul 2022, 17:03

First, it is part of the task to produce a dataset of the general.

OK, but the long data set is just as complete as the wide version, and is at least usable within Stata itself.

Second, I am having a code in matlab for a procedure which stata is not able to perform, yet Format needs to be in wide as in #1.

OK, this is a reason for doing what you ask. It seems from your description that all that matters is the grouping by country and the grouping by attribute (variable), but to assure greater consistency, I have also sorted the countries and the groupings alphabetically within their opposite groups.

Code:

// GROUPING SEPARATELY BY VARIABLE ds DEU_* local vbles `r(varlist)' local vbles: subinstr local vbles "DEU_" "", all local vbles: list sort vbles local n_vbles: word count `vbles' foreach i of numlist `n_vbles' (-1) 1 { local var: word `i' of `vbles' order *_`var', alpha after(ts) } // GROUPING SEPARATELY BY COUNTRY ds *_u local countries `r(varlist)' local countries: subinstr local countries "_u" "", all local countries: list sort countries local n_countries: word count `countries' foreach i of numlist `n_countries' (-1) 1 { local ctry: word `i' of `countries' order `ctry'_*, alpha after(ts) }
Comment
Mario Ferri

Join Date: Jul 2019

Posts: 190
#6

26 Jul 2022, 18:17

Originally posted by Clyde Schechter View Post

OK, but the long data set is just as complete as the wide version, and is at least usable within Stata itself.

OK, this is a reason for doing what you ask. It seems from your description that all that matters is the grouping by country and the grouping by attribute (variable), but to assure greater consistency, I have also sorted the countries and the groupings alphabetically within their opposite groups.

Code:

// GROUPING SEPARATELY BY VARIABLE ds DEU_* local vbles `r(varlist)' local vbles: subinstr local vbles "DEU_" "", all local vbles: list sort vbles local n_vbles: word count `vbles' foreach i of numlist `n_vbles' (-1) 1 { local var: word `i' of `vbles' order *_`var', alpha after(ts) } // GROUPING SEPARATELY BY COUNTRY ds *_u local countries `r(varlist)' local countries: subinstr local countries "_u" "", all local countries: list sort countries local n_countries: word count `countries' foreach i of numlist `n_countries' (-1) 1 { local ctry: word `i' of `countries' order `ctry'_*, alpha after(ts) }

Yes, all that matters is the grouping by country and the grouping by attribute (variable). The code provided works like a charm!

Thank you so much for one more time, Clyde.

Always have I said, this is an excellent forum composed of awesome people willing to offer their advice patiently for free.

I have been greatly benefited from this forum every time I've posted a question and cannot thank enough!

Thank you wholeheartedly for everything!
Best,
Mario
Comment

Announcement

Ordering Multiples variables

Comment

Comment

Comment

Comment

Comment