  • how can I create all pairwise combination of many variables?

    Dear all,

    I have this data

    code:

    clear
    input str6(var1 var2 var3 var4 var5)
    "A01B1" "A01C05" "A01C11" "A01M21" "E02F3"
    "A01H11" "B25D1" "E23P15" "" ""
    end

    and I want to create all possible pairwise combinations of the non-empty values in each row, in order to get this:

    code:
    clear
    input str6 (var1 var2 var3 var4 var5) str200 combine
    "A01B1" "A01C05" "A01C11" "A01M21" "E02F3" "A01B1.A01C05;A01B1.A01C11;A01B1.A01M21;A01B1.E02F3;A01C05.A01C11;A01C05.A01M21;A01C05.E02F3;A01C11.A01M21;A01C11.E02F3;A01M21.E02F3"
    "A01H11" "B25D1" "E23P15" "" "" "A01H11.B25D1;A01H11.E23P15;B25D1.E23P15"
    end


    Thank you in advance for the suggestions.

  • #2
    Jia:
    -pwcorr- will not work with -string- data.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Dear Carlo,

      Thank you for your attention. Yes, you are right. The difficulty is that the data are strings.

      Is it possible, then, to build combinations across string variables?

      Thanks.


      • #4
        Jia:
        Not that I know of.
        I would consider transforming your -string- data into numeric values and then using -pwcorr-.
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Dear Carlo,

          I looked up -pwcorr-, which seems to calculate all pairwise correlation coefficients. However, what I need is combinations of the values of the variables. I'm sorry, but I can't see how the two are connected.

          Could you please give me more hints on how to use this command to construct combinations of two elements?

          Thanks.


          • #6
            Here's one way. It is difficult to see what you would want to do with a variable holding the list of combinations, but I hope it helps anyway.
            Note that this does not include both orderings of a pair: for example, if A.B is included, B.A is not also included.

            Code:
            clear
            input str6(var1 var2 var3 var4 var5)
            "A01B1" "A01C05" "A01C11" "A01M21" "E02F3"
            "A01H11" "B25D1" "E23P15" "" ""
            end
            
            * Reshape long
            gen obs_id=_n
            reshape long var, i(obs_id) j(code_no)
            
            * Make values to be combined
            forvalues number = 1(1)5{
                bys obs_id (code_no): gen var`number'=var[_n+`number'-1]
            }
            
            * Make combinations
            forvalues number = 2(1)5{
                gen comb`number'=var1+"."+var`number'+";" if var1!="" & var`number'!=""
            }
            
            * Concatenate
            gen comb_all=comb2+comb3+comb4+comb5
            
            * Cleanup
            drop var1-var5 comb2-comb5
            
            * Reshape back to wide
            reshape wide var comb_all, i(obs_id) j(code_no) 
            
            * Concatenate the combinations
            gen comb_all=comb_all1+comb_all2+comb_all3+comb_all4+comb_all5
            drop comb_all1 comb_all2 comb_all3 comb_all4 comb_all5
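
            A possible alternative, as a minimal sketch only: if the data stay in the wide layout of #1 with exactly five variables named var1-var5 (an assumption), the pairs can also be built with two nested loops and no reshape. The variable name combine is taken from #1. The separator is added only between pairs, so there is no trailing ";".

            ```stata
            clear
            input str6(var1 var2 var3 var4 var5)
            "A01B1" "A01C05" "A01C11" "A01M21" "E02F3"
            "A01H11" "B25D1" "E23P15" "" ""
            end

            * Start empty; -replace- widens the string type as needed
            gen combine = ""

            * Loop over every pair (i, j) with i < j, so A.B is added but B.A is not
            forvalues i = 1/4 {
                local next = `i' + 1
                forvalues j = `next'/5 {
                    * Prefix ";" only when something is already stored; skip empty values
                    replace combine = combine + cond(combine == "", "", ";") + var`i' + "." + var`j' if var`i' != "" & var`j' != ""
                }
            }

            list combine
            ```

            With more than five variables, the loop bounds 4 and 5 would have to change accordingly.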


            • #7
              Dear Jorrit,

              Thank you for your help. The solution is perfect. Many thanks.


              • #8
                I want to measure the geographic proximity between each firm's investors. First I have to determine all possible pairwise combinations of a firm's investors. For example, if a firm has 3 investors, there are 3 possible combinations: 1-2, 1-3, 2-3. Then I have to measure the distance between every pair using a formula, and add the 3 distances to get the firm's investor proximity. How can I do this in Stata?


                • #9
                  I doubt that anyone can give you more than a general, rather vague, description of the process until you show an example of your data. While the general process of creating pairs is fairly simple, getting the data into shape for that can be tricky, or easy, depending on how the data is organized. So I recommend you post back showing a representative example of your data, using the -dataex- command to do so. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.


                  • #10
                    This is the simple structure of the data:
                    firm year investors
                    a 2020 1
                    a 2020 2
                    a 2020 3
                    b 2021 4
                    b 2021 5
                    b 2021 6
                    b 2021 7


                    • #11
                      Code:
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input str1 firm int year byte investors
                      "a" 2020 1
                      "a" 2020 2
                      "a" 2020 3
                      "b" 2021 4
                      "b" 2021 5
                      "b" 2021 6
                      "b" 2021 7
                      end
                      
                      isid firm year investors, sort
                      
                      tempfile copy
                      save `copy'
                      
                      rename investors investor1
                      joinby firm year using `copy'
                      rename investors investor2
                      keep if investor1 < investor2
                      sort firm year investor1 investor2

                      In the future, when showing data examples, please use the -dataex- command to do so, as I have here. See #9 for how to get -dataex- and why it helps.
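
                      To connect this back to #8, here is a rough sketch of the remaining steps (a distance per pair, then a total per firm). Everything about location is an assumption: the variables lat and lon and their values are made-up placeholders, and since #8 does not say which formula is meant, the great-circle distance from the spherical law of cosines with an Earth radius of 6371 km stands in for it. The names dist_km, total_dist and n_pairs are also only illustrative.

                      ```stata
                      clear
                      * lat/lon are hypothetical placeholder coordinates, not real data
                      input str1 firm int year byte investors double(lat lon)
                      "a" 2020 1 40.0 -74.0
                      "a" 2020 2 41.0 -87.0
                      "a" 2020 3 34.0 -118.0
                      "b" 2021 4 51.0 0.0
                      "b" 2021 5 48.0 2.0
                      "b" 2021 6 52.0 13.0
                      "b" 2021 7 40.0 -3.0
                      end

                      tempfile copy
                      save `copy'

                      * Same pairing as above, carrying the coordinates along
                      rename (investors lat lon) (investor1 lat1 lon1)
                      joinby firm year using `copy'
                      rename (investors lat lon) (investor2 lat2 lon2)
                      keep if investor1 < investor2

                      * Great-circle distance in km (spherical law of cosines, assumed formula)
                      * min(1, .) guards acos() against rounding just above 1
                      local d2r = _pi/180
                      gen double dist_km = 6371 * acos(min(1, sin(lat1*`d2r')*sin(lat2*`d2r') + cos(lat1*`d2r')*cos(lat2*`d2r')*cos((lon2 - lon1)*`d2r')))

                      * Sum the pairwise distances and count the pairs within each firm-year
                      collapse (sum) total_dist = dist_km (count) n_pairs = dist_km, by(firm year)
                      list
                      ```

                      The n_pairs column is a quick check on the pairing: a firm with n investors should show n(n-1)/2 pairs.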
