  • Creating a list of unique "values" from different variables

    Hello,

    I am working with a dataset with the names of 20 different medications taken by more than 1,000 patients. I need to know the list of all the medications distributed in 20 string variables.

    For example:
    Person Medication1 Medication2 Medication3 Medication4 Medication5...
    id1 a b c
    id2 b d e
    id3 c a f g h
    Result: List of unique values
    a
    b
    c
    d
    e
    f
    g
    h
    I have checked previous forums on how to compute the number of different observations but unfortunately have not succeeded.
    http://www.stata.com/support/faqs/da...-observations/

    In particular, I have tried with the following:
    by medication1 medication2, sort: gen nvals= _n ==1
    count if nvals
    replace nvals=sum(nvals)
    replace nvals= nvals[_N]


    Any help would be greatly appreciated.

    Kind regards,
    Alejandra

  • #2
    Alejandra, this is not the appropriate section of the forum for this question.
    Please re-post it in the general section to increase the chance of getting a reply.

    As for your question, it is not very clear to me.
    What do the letters a, b, c, etc. stand for? Are these the medication names? And if you already know that there are 20 medications, why do you need to recover the list this way?
    Or are a, b, c, etc. some other information (e.g. amount taken), with medication1 to medication20 being the 20 medication variables?
    In that case, the first step would be to run
    Code:
    tab medication1
    This lists all the values (a, b, c, etc.) that the variable medication1 takes. If there are not many distinct values in each medication variable, you could easily compile them by running a tabulate for each variable.
    Otherwise, a reshape long command would leave you with a single medication variable, and you could then run the tab command above on that one variable.

    See
    Code:
    help reshape
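    As a rough sketch of the tabulate-per-variable route: tab1 runs a one-way tabulate for each variable in a list. The toy data and the names person and medication1-medication3 are assumptions for illustration only; the real dataset would use medication1-medication20, and the range notation assumes those variables sit next to each other in the dataset.
    Code:
    clear
    * toy data: two patients, three medication slots (hypothetical names)
    input str3 person str1 medication1 str1 medication2 str1 medication3
    "id1" "a" "b" "c"
    "id2" "b" "d" "e"
    end
    * one frequency table per medication variable
    tab1 medication1-medication3
    This gives one table per variable, so the same name can show up in several tables and the lists still have to be merged by hand. That is why reshape long is the easier route with 20 variables.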
    Best,
    Charlie

    • #3
      Hi Charlie, thanks a lot for your message.

      Let me try to explain my question better. All the patients in my study were asked to list up to 20 medications they are currently taking, so I do not know how many "unique" medications are in my dataset. The "a", "b", "c"... are the medication names.

      I think reshape long followed by tab will help.

      Thanks for the advice Charlie.

      Kind regards,
      Alejandra

      • #4
        Ok, this is clearer now, and yes what you should do is a reshape long then tab.


        Code:
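        * one row per person-slot: meds is the slot number (1-20), Medication holds the name
        reshape long Medication, i(Person) j(meds)
        * tabulate the names, not the slot number
        tab Medication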
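        In case it helps to see the whole thing run end to end, here is a minimal sketch on toy data that mirrors the three-person example from the first post. The five-slot layout and the names Person, Medication1-Medication5 and meds are assumptions for illustration; the real dataset has 20 slots and may use different names.
        Code:
        clear
        * toy data: unused medication slots are empty strings
        input str3 Person str1 Medication1 str1 Medication2 str1 Medication3 str1 Medication4 str1 Medication5
        "id1" "a" "b" "c" "" ""
        "id2" "b" "d" "e" "" ""
        "id3" "c" "a" "f" "g" "h"
        end
        * one row per person-slot; meds is the slot number, Medication the name
        reshape long Medication, i(Person) j(meds)
        * drop the slots that were never filled in
        drop if missing(Medication)
        * one table row per distinct name, most frequent first
        tabulate Medication, sort
        * tabulate leaves the number of table rows in r(r)
        display r(r)
        On this toy data the table has eight rows, a through h, which matches the list asked for in the first post. If the names were typed in by hand, variants in spelling or capitalisation will show up as separate rows, so some cleaning may be needed first.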
        Best,
        Charlie

        • #5
          Thanks a lot Charlie!

          • #6
            You're welcome, but remember next time to post in the appropriate section (general).
            You'll get answered much faster.
            Best,
            Charlie
