  • Tab two string variables

    Hi everyone,

    This might be an easy question, but I just can't find the right command.

    I have two string variables that I need to get an overview of. In one variable I have around 150 unique treatment codes (on ~100,000 patients) and in the other variable I have the label for each treatment code.
    What I need is an output containing two columns with each treatment code appearing only once in the left column and the label for that treatment code in the right column. If the frequency and percentage for each of the codes could be included that would be great (just like in a regular tab command with only one variable).

    I can't install additional programs.

    Thank you for your help.


  • #2
    In Stata, "label" could mean a value label, but here it seems you mean a string value. Perhaps your set-up is like this:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str12 var1 str17 var2
    "short"        "longer"          
    "not so short" "much, much longer"
    "short"        "longer"          
    end

    If your strings are in one-to-one correspondence, then this should work:


    Code:
    . tabdisp var1, c(var2)
    
    --------------------------------
            var1 |              var2
    -------------+------------------
    not so short | much, much longer
           short |            longer
    --------------------------------
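
    Since the original poster cannot install additional programs, one possible alternative that uses only official commands is contract, which reduces the data to one row per distinct combination and adds frequency and percent columns. What follows is a minimal sketch on the example data above, not a definitive solution. The new variable names freq and pct are assumptions chosen for illustration, and the assert line only checks that each var1 value has a single var2 value.

    ```stata
    * Sketch under stated assumptions: official Stata commands only.
    clear
    input str12 var1 str17 var2
    "short"        "longer"
    "not so short" "much, much longer"
    "short"        "longer"
    end

    * Check that every var1 value maps to exactly one var2 value
    bysort var1 (var2): assert var2[1] == var2[_N]

    preserve                                      // contract replaces the data in memory
    contract var1 var2, freq(freq) percent(pct)   // freq, pct: hypothetical names
    list var1 var2 freq pct, noobs
    restore                                       // back to the original observations
    ```

    Wrapping the call in preserve and restore leaves the original data untouched, which matters because contract overwrites the dataset in memory.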

    For more flexibility, you might find groups useful. It is a community-contributed command, so you need to install it before you can use it. The 2017 paper explains it, but download the 2018 update of the software.

    Code:
    . groups var1 var2
    
      +----------------------------------------------------+
      |         var1                var2   Freq.   Percent |
      |----------------------------------------------------|
      | not so short   much, much longer       1     33.33 |
      |        short              longer       2     66.67 |
      +----------------------------------------------------+
    
    
    . search st0496, sj entry
    
    Search of official help files, FAQs, Examples, and Stata Journals
    
    SJ-18-1 st0496_1 . . . . . . . . . . . . . . . . . Software update for groups
    (help groups if installed) . . . . . . . . . . . . . . . . N. J. Cox
    Q1/18 SJ 18(1):291
    groups exited with an error message if weights were specified;
    this has been corrected
    
    SJ-17-3 st0496 . . . . . Speaking Stata: Tables as lists: The groups command
    (help groups if installed) . . . . . . . . . . . . . . . . N. J. Cox
    Q3/17 SJ 17(3):760--773
    presents command for listing group frequencies and percents and
    cumulations thereof; for various subsetting and ordering by
    frequencies, percents, and so on; for reordering of columns;
    and for saving tabulated data to new datasets

    Typing the search command above in Stata should give a clickable link to st0496_1.
    Last edited by Nick Cox; 04 May 2021, 01:34.



    • #3
      OP, you should post a sample of your data using -dataex- and explain what you want with reference to that actual data. We should not have to guess what you have in mind, and we should not have to make things up in response to what you might or might not be asking.

      Otherwise, what I think you want can easily be done with something I call a List Table.

      If you have two variables var and varlabel, you do

      Code:
      egen tagvar = tag(var)
      
      list var varlabel if tagvar

      Frequencies can be computed just as easily, but we need to know the frequency of what, with reference to what.
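
      As one reading of the request, the list-table idea can be extended to show how often each code occurs and its share of all observations. This is only a sketch: it assumes the variable names var and varlabel used above, reuses the small example from #2 as stand-in data, and introduces freq and pct as hypothetical new variables. Note that the percentage is taken over all observations in memory, including any with an empty code.

      ```stata
      * Sketch under stated assumptions: one listed row per code,
      * with its label, frequency, and percent.
      clear
      input str12 var str17 varlabel
      "short"        "longer"
      "not so short" "much, much longer"
      "short"        "longer"
      end

      egen tagvar = tag(var)               // 1 for one observation of each code
      bysort var: gen long freq = _N       // observations per code (hypothetical name)
      gen double pct = 100 * freq / _N     // share of all observations (hypothetical name)
      format pct %6.2f

      list var varlabel freq pct if tagvar, noobs
      ```

      Everything here is official Stata, so nothing needs to be installed.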



      • #4
        Dear Nick and Joro,

        Thank you very much for your suggestions. Both of them did what I wanted.
        I'll remember to use dataex next time I need advice.

        Have a good day.

        Best
