How to insert variable A's value as new rows of variable B

Chen Samulsion

Join Date: Jan 2018
Posts: 926

How to insert variable A's value as new rows of variable B

19 Aug 2019, 22:17

Dear Stata users,

I have a question about data management. Data example is as below. Suppose we have two variable, -abbr- is abbreviation of country name, and the corresponding -name- is full name of country. For some practical purpose, I want to insert values of variable -name- into variable -abbr- as new rows. Thus, the result should be as the second code below. I cannot find a soultion by resorting to command -expand- or -stack-, is there some one would help me? Thank you.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 abbr str20 name
"AD" "Andorra"            
"AE" "United Arab Emirates"
"AF" "Afghanistan"        
"AG" "Antigua and Barbuda"
"AI" "Anguilla"            
"AL" "Albania"            
"AM" "Armenia"            
end

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str20(abbr name)
"AD"                   "Andorra"            
"Andorra"              "United Arab Emirates"
"AE"                   "Afghanistan"        
"United Arab Emirates" "Antigua and Barbuda"
"AF"                   "Anguilla"            
"Afghanistan"          "Albania"            
"AG"                   "Armenia"            
"Antigua and Barbuda"  ""                    
"AI"                   ""                    
"Anguilla"             ""                    
"AL"                   ""                    
"Albania"              ""                    
"AM"                   ""                    
"Armenia"              ""                    
end

Last edited by Chen Samulsion; 19 Aug 2019, 22:20.

Tags: None

Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#2

19 Aug 2019, 23:07

I don't have a facile solution to your intermediate step, but I am curious about that practical purpose.

Sometimes when I find that my tentative path to an ultimate data-management objective requires my jumping through some strange hoops, there are better and more direct solutions that occur to me when I pause, stand back and look at it a little later.
Comment
Chen Samulsion

Join Date: Jan 2018

Posts: 926
#3

20 Aug 2019, 01:09

Joseph Coveney, thank you. As to my practical purpose, I am now writing a coding file, its coding rule is as follows. Code name is start with an asterisk (*), and code content is placed in next line. Usually, code content is as same as code name albeit the latter has a asterisk in its head. In practice, generating code content equals code name is a convenient way. But the problem is to insert code content into rows of code name, just as the question I posed here. Even if the practice is not the case, the above strategy is useful, because people may provide a tabulation with code name in first column and code content (condition) in second column, thus I should also make similar manipulation in order to meet the coding rule above.

Code:

*man man *woman woman *girl girl *boy boy
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10218

20 Aug 2019, 06:06

Looks like a simple reshape to me.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 abbr str20 name
"AD" "Andorra"            
"AE" "United Arab Emirates"
"AF" "Afghanistan"        
"AG" "Antigua and Barbuda"
"AI" "Anguilla"            
"AL" "Albania"            
"AM" "Armenia"            
end

rename (*) (var#), addnumber(1)
gen id=_n
reshape long var, i(id) j(which)
sort id which

Code:

. l, sepby(id)

     +-----------------------------------+
     | id   which                    var |
     |-----------------------------------|
  1. |  1       1                     AD |
  2. |  1       2                Andorra |
     |-----------------------------------|
  3. |  2       1                     AE |
  4. |  2       2   United Arab Emirates |
     |-----------------------------------|
  5. |  3       1                     AF |
  6. |  3       2            Afghanistan |
     |-----------------------------------|
  7. |  4       1                     AG |
  8. |  4       2    Antigua and Barbuda |
     |-----------------------------------|
  9. |  5       1                     AI |
 10. |  5       2               Anguilla |
     |-----------------------------------|
 11. |  6       1                     AL |
 12. |  6       2                Albania |
     |-----------------------------------|
 13. |  7       1                     AM |
 14. |  7       2                Armenia |
     +-----------------------------------+

Comment

Joseph Coveney

Join Date: Apr 2014

Posts: 4421
#5

20 Aug 2019, 06:15

Originally posted by Andrew Musau View Post

Looks like a simple reshape to me.

Yeah, if you don't want the second column shown in #1.
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10218

20 Aug 2019, 06:24

If it is useful...

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str3 abbr str20 name
"AD" "Andorra"            
"AE" "United Arab Emirates"
"AF" "Afghanistan"        
"AG" "Antigua and Barbuda"
"AI" "Anguilla"            
"AL" "Albania"            
"AM" "Armenia"            
end

preserve
drop abbr
gen id2=_n
tempfile name
save `name'
restore
rename (*) (abbr#), addnumber(1)
gen id=_n
reshape long abbr, i(id) j(which)
sort id which
gen id2=_n
merge 1:1 id2 using `name'
drop _merge id* which

Code:

. l, sep(20)

     +---------------------------------------------+
     |                 abbr                   name |
     |---------------------------------------------|
  1. |                   AD                Andorra |
  2. |              Andorra   United Arab Emirates |
  3. |                   AE            Afghanistan |
  4. | United Arab Emirates    Antigua and Barbuda |
  5. |                   AF               Anguilla |
  6. |          Afghanistan                Albania |
  7. |                   AG                Armenia |
  8. |  Antigua and Barbuda                        |
  9. |                   AI                        |
 10. |             Anguilla                        |
 11. |                   AL                        |
 12. |              Albania                        |
 13. |                   AM                        |
 14. |              Armenia                        |
     +---------------------------------------------+

Comment

Chen Samulsion

Join Date: Jan 2018

Posts: 926
#7

20 Aug 2019, 06:45

Andrew Musau thank you very much. You provide an excellent solution! Answer in #4 is enough for my purpose. I have not used -reshape- for a long time, I really forgot to resort to the powerful -reshape- command.
Comment

Announcement

How to insert variable A's value as new rows of variable B

Comment

Comment

Comment

Comment

Comment

Comment