
  • Encode command with conditions

    Hi, I have a question about the "encode" command.

    My data are shown below. I need to use the "cluster" option in my pooled logit model.
    But as you can see, when I pooled the data together, some cluster numbers are duplicated even though they represent different things in different countries.
    For example, "1" appears as a cluster number in Australia, the EU, and the USA, but it actually represents a different cluster in each country. However, the "1" in Australia-1 and Australia-2 represents the same cluster.
    How should I re-encode them and store the result as a "new cluster"?

    My data set is also very large. After I re-encode them, how can I check whether I did it right?

    Many thanks in advance.

    [Attached image: 1.PNG]

  • #2
    You don't show your commands at all, but I assume that you did use the -encode- command, separately on each file, before combining. However, that only works if you use the "label" option to ensure that everything is encoded consistently between files. Another option is to not encode until after you have combined the files and then use -encode- only once; see
    Code:
    help encode
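
    As a minimal sketch of the second option (combine first, then encode once): the two tiny data sets, the string variable cluster_name, and the combined variable key below are hypothetical, made up only for illustration. Building the key from country plus cluster name is an added assumption, so that identically named clusters in different countries receive different codes.

    ```stata
    * Sketch only: data sets and variable names are hypothetical.
    clear
    input str2 country str4 cluster_name
    "AU" "c1"
    "AU" "c2"
    end
    tempfile australia
    save `australia'

    clear
    input str2 country str4 cluster_name
    "US" "c1"
    "US" "c3"
    end

    * Combine the files first ...
    append using `australia'

    * ... then build one string key per country-cluster pair and encode it once,
    * so every pair gets exactly one integer code and one value label.
    generate key = country + "-" + cluster_name
    encode key, generate(cluster_id)

    list country cluster_name cluster_id, sepby(country)
    ```

    Because -encode- runs only once on the combined data, there is no need to keep value labels consistent across files.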



    • #3
      Originally posted by Rich Goldstein (#2)
      Hi Rich, thanks for the prompt reply.

      The cluster number is given in each data set, so I didn't use the encode command, and I can't decode it either.
      As for the "new cluster," I created it in Excel.
      Last edited by Jane Quan; 16 Jun 2021, 05:46.



      • #4
        Why do USA-1 cluster 8 and USA-1 cluster 5 have the same value, new cluster = 12? Import the data into Stata and provide a data example using dataex, e.g., by copying and pasting the result of

        Code:
        dataex country cluster new_cluster in 1/20



        • #5
          Originally posted by Andrew Musau (#4)
          Hi Andrew,
          I am new to Stata; sorry for asking the question poorly (I didn't give you the data set).
          Below are 50 observations from my data. There is no "new_cluster" variable yet; "new_cluster" is what I want to create based on country and cluster.
          I made it in Excel only to explain the problem. You are also right: I made a mistake, and USA-1 cluster 8 and USA-1 cluster 5 should not both have new cluster = 12.

          country cluster
          "ET" 639
          "ET" 463
          "ET" 639
          "ET" 384
          "RW" 150
          "ET" 384
          "BF" 422
          "MZ" 202
          "UG" 619
          "NG" 1183
          "BJ" 206
          "SN" 194
          "TD" 303
          "AO" 171
          "SN" 220
          "UG" 130
          "CI" 217
          "MW" 773
          "RW" 324
          "NG" 217
          "NG" 263
          "CD" 74
          "UG" 230
          "RW" 253
          "MW" 321
          "GN" 216
          "BF" 384
          "NI" 251
          "SN" 398
          "RW" 306
          "BU" 283
          "MZ" 38
          "NG" 323
          "TD" 42
          "BU" 59
          "GN" 28
          "RW" 72
          "ZW" 84
          "AO" 216
          "NG" 1347
          "SN" 73
          "SN" 236
          "SN" 69
          "CD" 216
          "SN" 51
          "NG" 228
          "BU" 36
          "NM" 216
          "CD" 388
          "NG" 272
          end



          • #6
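            EDITED: "wanted" numbers the clusters within each country, so the numbering restarts at 1 in every country. If each cluster must have an ID that is unique across countries, see "wanted2".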

            Thanks for the data example.

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input str5 country float cluster
            "ET"  639
            "ET"  463
            "ET"  639
            "ET"  384
            "RW"  150
            "ET"  384
            "BF"  422
            "MZ"  202
            "UG"  619
            "NG" 1183
            "BJ"  206
            "SN"  194
            "TD"  303
            "AO"  171
            "SN"  220
            "UG"  130
            "CI"  217
            "MW"  773
            "RW"  324
            "NG"  217
            "NG"  263
            "CD"   74
            "UG"  230
            "RW"  253
            "MW"  321
            "GN"  216
            "BF"  384
            "NI"  251
            "SN"  398
            "RW"  306
            "BU"  283
            "MZ"   38
            "NG"  323
            "TD"   42
            "BU"   59
            "GN"   28
            "RW"   72
            "ZW"   84
            "AO"  216
            "NG" 1347
            "SN"   73
            "SN"  236
            "SN"   69
            "CD"  216
            "SN"   51
            "NG"  228
            "BU"   36
            "NM"  216
            "CD"  388
            "NG"  272
            end
            
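            * wanted: running count of distinct clusters, restarting at 1 within each country
            bys country (cluster): gen wanted= sum(cluster!=cluster[_n-1])
            * wanted2: one ID per country-cluster pair, unique across the whole data set
            egen wanted2= group(country cluster)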
            Result:

            Code:
            
            . l, sepby(country)
            
                 +--------------------------------------+
                 | country   cluster   wanted   wanted2 |
                 |--------------------------------------|
              1. |      AO       171        1         1 |
              2. |      AO       216        2         2 |
                 |--------------------------------------|
              3. |      BF       384        1         3 |
              4. |      BF       422        2         4 |
                 |--------------------------------------|
              5. |      BJ       206        1         5 |
                 |--------------------------------------|
              6. |      BU        36        1         6 |
              7. |      BU        59        2         7 |
              8. |      BU       283        3         8 |
                 |--------------------------------------|
              9. |      CD        74        1         9 |
             10. |      CD       216        2        10 |
             11. |      CD       388        3        11 |
                 |--------------------------------------|
             12. |      CI       217        1        12 |
                 |--------------------------------------|
             13. |      ET       384        1        13 |
             14. |      ET       384        1        13 |
             15. |      ET       463        2        14 |
             16. |      ET       639        3        15 |
             17. |      ET       639        3        15 |
                 |--------------------------------------|
             18. |      GN        28        1        16 |
             19. |      GN       216        2        17 |
                 |--------------------------------------|
             20. |      MW       321        1        18 |
             21. |      MW       773        2        19 |
                 |--------------------------------------|
             22. |      MZ        38        1        20 |
             23. |      MZ       202        2        21 |
                 |--------------------------------------|
             24. |      NG       217        1        22 |
             25. |      NG       228        2        23 |
             26. |      NG       263        3        24 |
             27. |      NG       272        4        25 |
             28. |      NG       323        5        26 |
             29. |      NG      1183        6        27 |
             30. |      NG      1347        7        28 |
                 |--------------------------------------|
             31. |      NI       251        1        29 |
                 |--------------------------------------|
             32. |      NM       216        1        30 |
                 |--------------------------------------|
             33. |      RW        72        1        31 |
             34. |      RW       150        2        32 |
             35. |      RW       253        3        33 |
             36. |      RW       306        4        34 |
             37. |      RW       324        5        35 |
                 |--------------------------------------|
             38. |      SN        51        1        36 |
             39. |      SN        69        2        37 |
             40. |      SN        73        3        38 |
             41. |      SN       194        4        39 |
             42. |      SN       220        5        40 |
             43. |      SN       236        6        41 |
             44. |      SN       398        7        42 |
                 |--------------------------------------|
             45. |      TD        42        1        43 |
             46. |      TD       303        2        44 |
                 |--------------------------------------|
             47. |      UG       130        1        45 |
             48. |      UG       230        2        46 |
             49. |      UG       619        3        47 |
                 |--------------------------------------|
             50. |      ZW        84        1        48 |
                 +--------------------------------------+
            
            .
            Last edited by Andrew Musau; 16 Jun 2021, 07:17.
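
            On the earlier question of how to check the recoding in a large data set, one possible approach is sketched here on a few rows taken from the data example: reduce the data to one row per (country, cluster, wanted2) combination and let -isid- confirm that the mapping is one-to-one in both directions. This is an illustrative check, not part of the answer above.

            ```stata
            * A few rows from the data example, including clusters 384 and 216,
            * which occur in more than one country.
            clear
            input str5 country float cluster
            "ET" 639
            "ET" 384
            "BF" 384
            "ET" 639
            "AO" 216
            "CD" 216
            end

            egen wanted2 = group(country cluster)

            preserve
            * One row per distinct (country, cluster, wanted2) combination
            contract country cluster wanted2
            * Each country-cluster pair must have exactly one new ID ...
            isid country cluster
            * ... and each new ID must belong to exactly one country-cluster pair.
            isid wanted2
            restore
            ```

            If either -isid- stops with an error, the recoding is not one-to-one. Once it passes, the variable can be used for clustering, e.g. with vce(cluster wanted2) in -logit-. Adding the label option to group() attaches the original country and cluster values as value labels, which makes spot checks easier.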



            • #7
              Originally posted by Andrew Musau (#6)
              Hi Andrew,

              This solved my problem perfectly!
              Thank you so much!

              Have a good day^^
