Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How to insert variable A's value as new rows of variable B

    Dear Stata users,

    I have a question about data management. Data example is as below. Suppose we have two variable, -abbr- is abbreviation of country name, and the corresponding -name- is full name of country. For some practical purpose, I want to insert values of variable -name- into variable -abbr- as new rows. Thus, the result should be as the second code below. I cannot find a soultion by resorting to command -expand- or -stack-, is there some one would help me? Thank you.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str3 abbr str20 name
    "AD" "Andorra"            
    "AE" "United Arab Emirates"
    "AF" "Afghanistan"        
    "AG" "Antigua and Barbuda"
    "AI" "Anguilla"            
    "AL" "Albania"            
    "AM" "Armenia"            
    end
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str20(abbr name)
    "AD"                   "Andorra"            
    "Andorra"              "United Arab Emirates"
    "AE"                   "Afghanistan"        
    "United Arab Emirates" "Antigua and Barbuda"
    "AF"                   "Anguilla"            
    "Afghanistan"          "Albania"            
    "AG"                   "Armenia"            
    "Antigua and Barbuda"  ""                    
    "AI"                   ""                    
    "Anguilla"             ""                    
    "AL"                   ""                    
    "Albania"              ""                    
    "AM"                   ""                    
    "Armenia"              ""                    
    end
    Last edited by Chen Samulsion; 19 Aug 2019, 22:20.

  • #2
    I don't have a facile solution to your intermediate step, but I am curious about that practical purpose.

    Sometimes when I find that my tentative path to an ultimate data-management objective requires my jumping through some strange hoops, there are better and more direct solutions that occur to me when I pause, stand back and look at it a little later.

    Comment


    • #3
      Joseph Coveney, thank you. As to my practical purpose, I am now writing a coding file, its coding rule is as follows. Code name is start with an asterisk (*), and code content is placed in next line. Usually, code content is as same as code name albeit the latter has a asterisk in its head. In practice, generating code content equals code name is a convenient way. But the problem is to insert code content into rows of code name, just as the question I posed here. Even if the practice is not the case, the above strategy is useful, because people may provide a tabulation with code name in first column and code content (condition) in second column, thus I should also make similar manipulation in order to meet the coding rule above.
      Code:
      *man
      man
      *woman
      woman
      *girl
      girl
      *boy
      boy

      Comment


      • #4
        Looks like a simple reshape to me.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str3 abbr str20 name
        "AD" "Andorra"            
        "AE" "United Arab Emirates"
        "AF" "Afghanistan"        
        "AG" "Antigua and Barbuda"
        "AI" "Anguilla"            
        "AL" "Albania"            
        "AM" "Armenia"            
        end
        
        rename (*) (var#), addnumber(1)
        gen id=_n
        reshape long var, i(id) j(which)
        sort id which
        Code:
        . l, sepby(id)
        
             +-----------------------------------+
             | id   which                    var |
             |-----------------------------------|
          1. |  1       1                     AD |
          2. |  1       2                Andorra |
             |-----------------------------------|
          3. |  2       1                     AE |
          4. |  2       2   United Arab Emirates |
             |-----------------------------------|
          5. |  3       1                     AF |
          6. |  3       2            Afghanistan |
             |-----------------------------------|
          7. |  4       1                     AG |
          8. |  4       2    Antigua and Barbuda |
             |-----------------------------------|
          9. |  5       1                     AI |
         10. |  5       2               Anguilla |
             |-----------------------------------|
         11. |  6       1                     AL |
         12. |  6       2                Albania |
             |-----------------------------------|
         13. |  7       1                     AM |
         14. |  7       2                Armenia |
             +-----------------------------------+

        Comment


        • #5
          Originally posted by Andrew Musau View Post
          Looks like a simple reshape to me.
          Yeah, if you don't want the second column shown in #1.

          Comment


          • #6
            If it is useful...

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input str3 abbr str20 name
            "AD" "Andorra"            
            "AE" "United Arab Emirates"
            "AF" "Afghanistan"        
            "AG" "Antigua and Barbuda"
            "AI" "Anguilla"            
            "AL" "Albania"            
            "AM" "Armenia"            
            end
            
            preserve
            drop abbr
            gen id2=_n
            tempfile name
            save `name'
            restore
            rename (*) (abbr#), addnumber(1)
            gen id=_n
            reshape long abbr, i(id) j(which)
            sort id which
            gen id2=_n
            merge 1:1 id2 using `name'
            drop _merge id* which
            Code:
            . l, sep(20)
            
                 +---------------------------------------------+
                 |                 abbr                   name |
                 |---------------------------------------------|
              1. |                   AD                Andorra |
              2. |              Andorra   United Arab Emirates |
              3. |                   AE            Afghanistan |
              4. | United Arab Emirates    Antigua and Barbuda |
              5. |                   AF               Anguilla |
              6. |          Afghanistan                Albania |
              7. |                   AG                Armenia |
              8. |  Antigua and Barbuda                        |
              9. |                   AI                        |
             10. |             Anguilla                        |
             11. |                   AL                        |
             12. |              Albania                        |
             13. |                   AM                        |
             14. |              Armenia                        |
                 +---------------------------------------------+

            Comment


            • #7
              Andrew Musau thank you very much. You provide an excellent solution! Answer in #4 is enough for my purpose. I have not used -reshape- for a long time, I really forgot to resort to the powerful -reshape- command.

              Comment

              Working...
              X