  • Define a categorical variable.

    Dear All, Given the following data (this example is taken from http://bbs.pinggu.org/thread-6277610-1-1.html), I'd like to create a categorical variable d that equals 1 if b2a == rmax, 2 if b2b == rmax, 3 if b2c == rmax, and 4 if b2d == rmax.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(b2a b2b b2c b2d) float rmax
    100  0  0  0 100
      0 80 20  0  80
      0  0 60 40  60
     90  0 10  0  90
      0 10 20 70  70
    end
    label values b2a B2A
    label values b2b B2B
    label values b2c B2C
    label values b2d B2D
    The following is my code:
    Code:
    gen d = (b2a == rmax)
    replace d = 2 if b2b == rmax
    replace d = 3 if b2c == rmax
    replace d = 4 if b2d == rmax
    Any other suggestions?
    Last edited by River Huang; 16 Mar 2018, 02:51.
    Ho-Chuan (River) Huang
    Stata 17.0, MP(4)

  • #2
    What happens if two or more conditions are satisfied?
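    To make the question concrete, here is a minimal sketch of what the code in #1 does when two conditions hold. The one-row dataset is hypothetical (it is not from the original data) and is constructed so that b2a and b2c both equal rmax. Because the replace statements run in order, the last matching condition overwrites the earlier ones.

    ```stata
    * Hypothetical observation with a tie: b2a and b2c both equal rmax
    clear
    input byte(b2a b2b b2c b2d) float rmax
    50 0 50 0 50
    end

    * Code from #1, unchanged
    gen d = (b2a == rmax)
    replace d = 2 if b2b == rmax
    replace d = 3 if b2c == rmax
    replace d = 4 if b2d == rmax

    * d is 3 here: the later match (b2c) replaces the earlier one (b2a)
    list
    ```

    Note also that d is 0, not missing, in any observation where none of the four variables equals rmax.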



    • #3
      Hi, Nick, I know your concern, but there is no such situation in the data set.

      Ho-Chuan (River) Huang
      Stata 17.0, MP(4)



      • #4
        Experience tells me that in a real dataset anything can happen, including things that should not be possible. So robust code should at least check for the impossible and return an error message when it happens.
        ---------------------------------
        Maarten L. Buis
        University of Konstanz
        Department of history and sociology
        box 40
        78457 Konstanz
        Germany
        http://www.maartenbuis.nl
        ---------------------------------
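        One way such a check might look is sketched below, using the example data from #1. The helper variable nmatch and the wording of the error message are assumptions made for illustration; the check stops with an error if any observation has zero or several of b2a-b2d equal to rmax.

        ```stata
        * Example data from #1
        clear
        input byte(b2a b2b b2c b2d) float rmax
        100  0  0  0 100
          0 80 20  0  80
          0  0 60 40  60
         90  0 10  0  90
          0 10 20 70  70
        end

        * nmatch (hypothetical helper): how many of b2a-b2d equal rmax
        gen byte nmatch = (b2a == rmax) + (b2b == rmax) + (b2c == rmax) + (b2d == rmax)

        * Stop with an error unless exactly one variable matches in every observation
        capture assert nmatch == 1
        if _rc {
            display as error "zero or several of b2a-b2d equal rmax in some observations"
            list b2a b2b b2c b2d rmax nmatch if nmatch != 1
            exit 459
        }
        drop nmatch
        ```

        With the five example observations the assertion holds, so the code runs through silently and d can then be created as in #1.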



        • #5
          Dear Maarten, I could not agree more. My suggested procedure has been tested on a real data set (the percentage of shares held by different types of owners) by the person who asked this question. But I share your concern for other (more general) situations.
          Ho-Chuan (River) Huang
          Stata 17.0, MP(4)



          • #6
            What makes you still unsatisfied with your code? If you are after a one-line solution, the following might be of some use.

            1. If you can be sure that exactly one of b2* equals rmax in each observation:
            Code:
            gen d1=1*(b2a==rmax)+ 2*(b2b==rmax)+ 3*(b2c==rmax)+4*(b2d==rmax)
            2. If not (on ties, the last matching variable wins):
            Code:
            gen d2=1*(b2a==rmax)*(b2b<rmax)*(b2c<rmax)*(b2d<rmax)+2*(b2b==rmax)*(b2c<rmax)*(b2d<rmax)+3*(b2c==rmax)*(b2d<rmax)+4*(b2d==rmax)
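            If concision matters mostly because the list of variables may grow, a loop is another option. The following is only a sketch on the example data from #1; the counter i and the choice to let the first match win on ties are assumptions, and the tie rule differs from d2 above, where the last match wins.

            ```stata
            * Example data from #1
            clear
            input byte(b2a b2b b2c b2d) float rmax
            100  0  0  0 100
              0 80 20  0  80
              0  0 60 40  60
             90  0 10  0  90
              0 10 20 70  70
            end

            * Start from missing so that unmatched observations are easy to spot
            gen byte d = .
            local i = 0
            foreach v of varlist b2a b2b b2c b2d {
                local ++i
                * Assign position i only if no earlier variable has matched
                replace d = `i' if `v' == rmax & missing(d)
            }
            list
            ```

            Here d stays missing, rather than 0, when none of the variables equals rmax, so a subsequent count if missing(d) reveals any unmatched observations.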



            • #7
              Hi Romalpa, Yes, I'm looking for more concise code, if there is any. Many thanks for your suggestions.
              Ho-Chuan (River) Huang
              Stata 17.0, MP(4)
