Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Combining variables

    I am looking to combine two variables

    state district

    and I typed Wwith the response:

    . joinby state district,gen(statedist) using(O:\fentanyl.dta)
    using required
    r(100);

    . help combine

    . help append

    . dataex

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str30 name str12 state1 str2 fentanyl1 int votefentanyl2 str10 fentanyl3 str12 stste str13 district long(votefentanyl1 votefentanyl3)
    "ADAMS, Alma"                 "AL" "6" 6 "6" "AL" "12" 3 3
    "ADERHOLT, Robert"            "AL" "1" 1 "1" "NC" "12" 2 2
    "AGUILAR, Peter Rey"          "CA" "6" 6 "6" "NC" "12" 3 3
    "ALFORD, Mark"                "MO" "1" 1 "1" "AL" "4"  2 2
    "ALLEN, Rick W."              "GA" "1" 1 "1" "AL" "4"  2 2
    "ALLRED, Colin"               "TX" "6" 0 "0" "CA" "33" 3 1
    "AMO, Gabe"                   "RI" "0" 6 "6" "CA" "33" 1 3
    "AMODEI, Mark E."             "NV" "1" 1 "1" "MO" "4"  2 2
    "ANSARI, Yassamin"            "AZ" "0" 6 "6" "MO" "4"  1 3
    "ARMSTRONG, Kelly"            "ND" "1" 0 "0" "GA" "12" 2 1
    "ARRINGTON, Jodey Cook"       "TX" "1" 1 "1" "GA" "12" 2 2
    "AUCHINCLOSS, Jake"           "MA" "9" 6 "6" "TX" "32" 4 3
    "BABIN, Brian"                "TX" "1" 1 "1" "RI" "1"  2 2
    "BACON, Donald J."            "NE" "1" 1 "1" "NV" "2"  2 2
    "BAIRD, James"                "IN" "1" 1 "1" "NV" "2"  2 2
    "BALDERSON, Troy"             "OH" "1" 1 "1" "AZ" "3"  2 2
    "BALINT, Becca"               "VT" "6" 6 "6" "ND" "1"  3 3
    "BANKS, James E."             "IN" "1" 0 "0" "TX" "19" 2 1
    "BARR, Garland H. (Andy) IV"  "KY" "1" 1 "1" "TX" "19" 2 2
    "BARRAGÁN, Nanette Diaz"     "CA" "6" 6 "6" "MA" "4"  3 3
    "BARRETT, Tom"                "MI" "0" 1 "1" "MA" "4"  1 2
    "BAUMGARTNER, Michael"        "WA" "0" 1 "9" "TX" "36" 1 4
    "BEAN, Aaron"                 "FL" "1" 1 "1" "TX" "36" 2 2
    "BEATTY, Joyce"               "OH" "6" 6 "6" "NE" "2"  3 3
    "BEGICH, Nicholas J., III"    "AK" "0" 1 "1" "NE" "2"  1 2
    "BELL, Wesley"                "MO" "0" 6 "6" "IN" "4"  1 3
    "BENTZ, Cliff"                "OR" "1" 1 "1" "IN" "4"  2 2
    "BERA, Ami"                   "CA" "1" 6 "6" "OH" "12" 2 3
    "BERGMAN, John"               "MI" "1" 1 "1" "OH" "12" 2 2
    "BEYER, Donald Sternoff Jr."  "VA" "9" 6 "6" "VT" "1"  4 3
    "BICE, Stephanie I."          "OK" "1" 1 "1" "VT" "1"  2 2
    "BIGGS, Andrew S."            "AZ" "1" 1 "1" "IN" "3"  2 2
    "BIGGS, Sheri"                "SC" "0" 1 "1" "KY" "6"  1 2
    "BILIRAKIS, Gus"              "FL" "1" 1 "1" "KY" "6"  2 2
    "BISHOP, Dan"                 "NC" "1" 0 "0" "CA" "44" 2 1
    "BISHOP, Sanford Dixon, Jr."  "GA" "1" 6 "6" "CA" "44" 2 3
    "BLUMENAUER, Earl"            "OR" "6" 0 "0" "MI" "7"  3 1
    "BLUNT ROCHESTER, Lisa"       "DE" "6" 0 "0" "WA" "5"  3 1
    "BOEBERT, Lauren"             "CO" "1" 1 "1" "FL" "4"  2 2
    "BONAMICI, Suzanne"           "OR" "6" 6 "6" "FL" "4"  3 3
    "BOST, Mike"                  "IL" "1" 1 "1" "OH" "3"  2 2
    "BOWMAN, Jamaal"              "NY" "6" 0 "0" "OH" "3"  3 1
    "BOYLE, Brendan Francis"      "PA" "1" 6 "6" "AK" "1"  2 3
    "BRECHEEN, Josh"              "OK" "1" 1 "1" "MO" "1"  2 2
    "BRESNAHAN, Robert P., Jr."   "PA" "0" 1 "1" "OR" "2"  1 2
    "BROWN, Shontel  M."          "OH" "6" 6 "6" "OR" "2"  3 3
    "BROWNLEY, Julia"             "CA" "6" 6 "6" "CA" "6"  3 3
    "BUCHANAN, Vernon G."         "FL" "1" 1 "1" "CA" "6"  2 2
    "BUCK, Kenneth Robert"        "CO" "1" 0 "0" "MI" "1"  2 1
    "BUCSHON, Larry"              "IN" "1" 0 "0" "MI" "1"  2 1
    "BUDZINSKI, Nikki"            "IL" "6" 6 "6" "VA" "8"  3 3
    "BURCHETT, Timothy"           "TN" "1" 1 "1" "VA" "8"  2 2
    "BURGESS, Michael C."         "TX" "1" 0 "0" "OK" "5"  2 1
    "BURLISON, Eric"              "MO" "1" 1 "9" "OK" "5"  2 4
    "BUSH, Cori"                  "MO" "6" 0 "0" "AZ" "5"  3 1
    "BYNUM, Janelle S."           "OR" "0" 6 "6" "AZ" "5"  1 3
    "CALVERT, Ken"                "CA" "1" 1 "1" "SC" "3"  2 2
    "CAMMACK, Kat"                "FL" "1" 1 "1" "FL" "12" 2 2
    "CARAVEO, Yadira"             "CO" "1" 0 "0" "FL" "12" 2 1
    "CARBAJAL, Salud"             "CA" "1" 6 "6" "NC" "8"  2 3
    "CÁRDENAS, Tony"             "CA" "6" 0 "0" "GA" "2"  3 1
    "CAREY, Mike"                 "OH" "1" 1 "1" "GA" "2"  2 2
    "CARL, Jerry L."              "AL" "1" 0 "0" "OR" "3"  2 1
    "CARSON, André"              "IN" "6" 6 "6" "DE" "1"  3 3
    "CARTER, Buddy"               "GA" "1" 1 "1" "CO" "4"  2 2
    "CARTER, John R."             "TX" "1" 1 "1" "CO" "3"  2 2
    "CARTER, Troy"                "LA" "6" 6 "6" "OR" "1"  3 3
    "CARTWRIGHT, Matt"            "PA" "1" 0 "0" "OR" "1"  2 1
    "CASAR, Greg"                 "TX" "6" 6 "6" "IL" "12" 3 3
    "CASE, Ed"                    "HI" "1" 6 "6" "IL" "12" 2 3
    "CASTEN, Sean"                "IL" "6" 6 "6" "NY" "16" 3 3
    "CASTOR, Kathy"               "FL" "6" 6 "6" "PA" "2"  3 3
    "CASTRO, Joaquin"             "TX" "6" 6 "6" "PA" "2"  3 3
    "CHAVEZ-DEREMER, Lori"        "OR" "1" 0 "0" "OK" "2"  2 1
    "CHERFILUS-MCCORMICK, Sheila" "FL" "6" 6 "6" "OK" "2"  3 3
    "CHU, Judy"                   "CA" "6" 6 "6" "PA" "8"  3 3
    "CICILLINE, David N."         "RI" "9" 0 "0" "OH" "11" 4 1
    "CISCOMANI, Juan"             "AZ" "9" 9 "1" "OH" "11" 4 2
    "CISNEROS, Gil"               "CA" "0" 6 "6" "CA" "26" 1 3
    "CLARK, Katherine M."         "MA" "6" 6 "6" "CA" "26" 3 3
    "CLARKE, Yvette Diane"        "NY" "6" 6 "6" "FL" "16" 3 3
    "CLEAVER, Emanuel, II"        "MO" "6" 6 "6" "FL" "16" 3 3
    "CLINE, Benjamin"             "VA" "1" 1 "1" "CO" "4"  2 2
    "CLOUD, Michael"              "TX" "1" 1 "1" "IN" "8"  2 2
    "CLYBURN, James Enos"         "SC" "6" 6 "6" "IL" "13" 3 3
    "CLYDE, Andrew S."            "GA" "1" 1 "1" "IL" "13" 2 2
    "COHEN, Stephen"              "TN" "6" 6 "6" "TN" "2"  3 3
    "COLE, Tom"                   "OK" "1" 1 "1" "TN" "2"  2 2
    "COLLINS, Mike"               "GA" "1" 1 "1" "TX" "26" 2 2
    "COMER, James"                "KY" "1" 1 "1" "MO" "7"  2 2
    "CONAWAY, Herbert C., Jr."    "NJ" "0" 6 "6" "MO" "7"  1 3
    "CONNOLLY, Gerald E. (Gerry)" "VA" "6" 6 "6" "MO" "1"  3 3
    "CORREA, Jose Luis"           "CA" "6" 6 "6" "OR" "5"  3 3
    "COSTA, Jim"                  "CA" "1" 6 "6" "CA" "41" 2 3
    "COURTNEY, Joe"               "CT" "1" 6 "6" "CA" "41" 2 3
    "CRAIG, Angela"               "MN" "1" 6 "6" "FL" "3"  2 3
    "CRANE, Eli"                  "AZ" "1" 1 "1" "FL" "3"  2 2
    "CRANK, Jeff"                 "CO" "0" 1 "1" "CO" "8"  1 2
    "CRAWFORD, Rick"              "AR" "1" 1 "1" "CA" "24" 2 2
    "CRENSHAW, Daniel"            "TX" "1" 1 "1" "CA" "24" 2 2
    end
    label values votefentanyl1 votefentanyl1
    label def votefentanyl1 1 "0", modify
    label def votefentanyl1 2 "1", modify
    label def votefentanyl1 3 "6", modify
    label def votefentanyl1 4 "9", modify
    label values votefentanyl3 votefentanyl3
    label def votefentanyl3 1 "0", modify
    label def votefentanyl3 2 "1", modify
    label def votefentanyl3 3 "6", modify
    label def votefentanyl3 4 "9", modify
    ------------------ copy up to and including the previous line ------------------

    Listed 100 out of 869 observations
    Use the count() option to list more

    If I can't use joinby (since I have no other dataset to join), is there a way to combine AL 12 to get AL-12?

    Thanks,

    Ric Uslaner.

  • #2
    Well, you have no variable in the data set named state, so no, you cannot do that. I assume you meant either the variable state1 or the variable stste (which looks to me like a typo for state). Anyway, assuming you meant state1:
    Code:
    egen wanted = concat(state1 district), punct("-")
    Of course, make the obvious change to that if you wanted to use stste instead of state1.

    By the way, even if you had state in one data set and district in another, -joinby- would not accomplish this task (nor would -merge-). -joinby- would combine the data sets "side by side" but the variables in the two data sets would be left intact. You would still need -egen, concat()- to make one variable out of the two after -joinby- ran.

    Comment


    • #3
      See also

      Code:
      SJ-7-4  dm0034  . . . Stata tip 52: Generating composite categorical variables
              . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
              Q4/07   SJ 7(4):582--583                                 (no commands)
              tip on how to generate categorical variables using
              tostring and egen, group()

      Comment


      • #4
        Even simpler than #2 (at least to my mind) -- all you need is
        Code:
        gen wanted = state1 + "-" + district
        The egen command in #2 can be more powerful, but its power is harnessed only when you are combining (many?) more than two string variables.
        Last edited by Hemanshu Kumar; 07 Feb 2026, 23:01.

        Comment


        • #5
          Thanks Hemanshu and Clyde,

          Ric

          Comment


          • #6
            I tried Hemanshu's suggestion as such with one of the datasets:

            . dataex statenew1 cd in 1/20

            ----------------------- copy starting from the next line -----------------------
            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float statenew1 long cd
            2 3
            2 3
            2 3
            2 3
            2 3
            2 3
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            end
            label values cd cd
            label def cd 3 "AK01", modify
            label def cd 4 "AL01", modify
            ------------------ copy up to and including the previous line ------------------

            Listed 20 out of 54280 observations

            . dataex statenew1 cd in 1/40

            ----------------------- copy starting from the next line -----------------------
            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float statenew1 long cd
            2 3
            2 3
            2 3
            2 3
            2 3
            2 3
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 4
            1 5
            end
            label values cd cd
            label def cd 3 "AK01", modify
            label def cd 4 "AL01", modify
            label def cd 5 "AL02", modify
            ------------------ copy up to and including the previous line ------------------

            Listed 40 out of 54280 observations

            statenew1 cd are both numeric variables
            . des cd statenew1

            storage display value
            variable name type format label variable label
            ---------------------------------------------------------------------------------------------------------------------------
            cd long %8.0g cd State and Congressional District- Postal Abbrev and CD
            statenew1 float %9.0g

            but when I type

            . gen cdnewall = statenew1 + "-" + cd

            I get:
            type mismatch
            r(109);

            why can't I add statenew1 and cd?

            Thanks for any help

            Ric Uslaner
            .


            Comment


            • #7
              In #1, your original state and district variables were string variables. In the example you show in #6, statenew1 and cd are numeric variables, not strings. cd looks like a string because it has attached value labels. But numeric it is, and string it is. When doing calculations, Stata does not care about the value labels--everything is done with the underlying numbers. So the type mismatch results from trying to combine statenew1 and cd (numeric) with "-", a string. If you want to follow Hemanshu Kumar 's approach, you can do it as:

              Code:
              decode cd, gen(cd_string)
              gen wanted = string(statenew1) + "-" + cd_strin
              Or, you can do it in one line as
              Code:
              egen wanted = concat(statenew1 cd), punct("-") decode

              Comment


              • #8
                All the answers are right here, and complementary.

                Hemanshu Kumar is right that putting two string variables is essentially just addition with a small extra twist of whatever punctuation you might want to add in between.

                egen, concat() was originally written to be more versatile as is sometimes desirable, (a) to concatenate two or more variables (b) to allow punctuation to be specified just once (c) to allow on the fly conversion of numeric values (d) to allow use of value labels.

                Stata is usually right. or at least trying to act in your own best interests, in regarding attempts to add a string to a number as errors to be flagged.

                Comment

                Working...
                X