Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • response data separate by comma in one column

    I am attempting to generate frequency tables. For one survey question, it asked to "Choose two options" out of 10 options. And the data is now store in one column, with separate by comma as can see below. I am hoping for frequency table to look at these responses all together with percentage equal 100% but am having issues with the comma. Thank you for your help.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str5 jobdesc
    "2"    
    "6,7"  
    "1,5"  
    "5,6"  
    "3,6"  
    "1,5"  
    "6,10" 
    "1,6"  
    "2,7"  
    "1,4"  
    "2,4"  
    "5,6"  
    "5,6"  
    "5,6"  
    "1"    
    "1,3"  
    "1,4"  
    "1,6"  
    "1,4"  
    "1,4"  
    "5,7"  
    "1,5"  
    "1"    
    "4,5"  
    "5,6"  
    "5,7"  
    "1,6"  
    "1,5"  
    "1,3"  
    "1,8"  
    "1,5"  
    "1,2"  
    "1,10" 
    "10,11"
    "1,3"  
    "1,6"  
    "4"    
    "1,5"  
    "1"    
    "3,7"  
    "1,6"  
    "5,6"  
    "1,5"  
    "1,2"  
    "1,6"  
    "5,6"  
    "1,6"  
    "1,5"  
    "1,5"  
    "1,7"  
    "1,6"  
    "3,9"  
    "2,3"  
    "1,5"  
    "5"    
    "4,5"  
    "2,3"  
    "5"    
    "1,6"  
    "4,5"  
    "2"    
    "6,7"  
    "2,6"  
    "2,7"  
    "1,5"  
    "5,9"  
    "1"    
    "5"    
    "1,5"  
    "3,6"  
    "1,2"  
    "1,6"  
    "1,6"  
    "3,6"  
    "5"    
    "2,3"  
    "1,5"  
    "2,3"  
    "4"    
    "5,8"  
    "2,3"  
    "1,5"  
    "1,5"  
    "1,3"  
    "1,5"  
    "1,7"  
    "3,5"  
    "5"    
    "3,6"  
    "3,7"  
    "4,6"  
    "4,5"  
    "4,7"  
    "1,5"  
    "1,5"  
    "1,5"  
    "2,5"  
    "3"    
    "5"    
    "1"    
    end

  • #2
    Code:
    gen choice1 = real(ustrregexs(1)) if ustrregexm(jobdesc ,"([0-9]+),")
    gen choice2 = real(ustrregexs(1)) if ustrregexm(jobdesc,",([^,]+$)")
    replace choice1= real(jobdesc) if length(jobdesc)<3
    Last edited by Andrew Musau; 25 Jul 2019, 16:59.

    Comment


    • #3
      Very much appreciated for this.

      This code splits them into two separate columns, I believe? Main issue I am having stems from wanting to -tab- or -fre- them all together in one output table, due to their being responses to the same question. Is there a way to do this now with the two separate columns of responses?

      Comment


      • #4
        Consider tabsplit in the package tab_chi (SSC). With your example data (thanks) i get this

        Code:
        . tabsplit jobdesc , p(,)
        
            jobdesc |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  1 |         50       27.17       27.17
                 10 |          3        1.63       28.80
                 11 |          1        0.54       29.35
                  2 |         15        8.15       37.50
                  3 |         18        9.78       47.28
                  4 |         13        7.07       54.35
                  5 |         42       22.83       77.17
                  6 |         27       14.67       91.85
                  7 |         11        5.98       97.83
                  8 |          2        1.09       98.91
                  9 |          2        1.09      100.00
        ------------+-----------------------------------
              Total |        184      100.00
        so perhaps there should be a destring option, so that values are sorted numerically. Alternatively, consider this

        Code:
         
        . split jobdesc, p(,) destring
        variables born as string: 
        jobdesc1  jobdesc2
        jobdesc1: all characters numeric; replaced as byte
        jobdesc2: all characters numeric; replaced as byte
        (16 missing values generated)
        
        . tabm jobdesc?, transpose
        
                   |       variable
            values |  jobdesc1   jobdesc2 |     Total
        -----------+----------------------+----------
                 1 |        50          0 |        50 
                 2 |        12          3 |        15 
                 3 |         9          9 |        18 
                 4 |         8          5 |        13 
                 5 |        17         25 |        42 
                 6 |         3         24 |        27 
                 7 |         0         11 |        11 
                 8 |         0          2 |         2 
                 9 |         0          2 |         2 
                10 |         1          2 |         3 
                11 |         0          1 |         1 
        -----------+----------------------+----------
             Total |       100         84 |       184
        where tabm is part of the same package, to be installed with

        Code:
        ssc install tab_chi
        tabsplit tabm tab_chi are thus revealed as otherwise unpredictable search terms for mentions on Statalist.

        Comment


        • #5
          This seems to work very well, thank you Nick. I had not used tab_chi package previous.

          Does tabm/tab_chi have a way to make that final table so only the "values" and "Total" columns are spit out, with a Percentages column as well? And now that it has "destring", can I attach labels to the number responses?
          Last edited by Todd Motiwalla; 26 Jul 2019, 08:20.

          Comment


          • #6
            This does not work well for me. I also tried with -tabsplit-

            . split r2, p(,) destring
            variables born as string:

            r21 r22 r23
            r21: all characters numeric; replaced as byte
            (171 missing values generated)
            r22: all characters numeric; replaced as byte
            (274 missing values generated)
            r23: all characters numeric; replaced as byte
            (323 missing values generated)

            The variable r2 remains strL type. I want to generate new binary variables for each value 1,2,3 of my variables.

            Comment

            Working...
            X