
  • How to realize the functions like (gen sum) and (egen sum) for string variables by group?

    In the sample data, B is what I want to get from A (something like egen sum, but for strings). I also want to generate a variable holding the running concatenation of the first through jth observations on A within each group (something like gen sum).
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str54 A float tempid str65 B
    "negAffect5"            1 "negAffect5 posAffect4 negAffect2 negAffect3"                      
    "posAffect4"            1 "negAffect5 posAffect4 negAffect2 negAffect3"                      
    "negAffect2 negAffect3" 1 "negAffect5 posAffect4 negAffect2 negAffect3"                      
    "negAffect5"            2 "negAffect5 posAffect4 negAffect2 negAffect3"                      
    "posAffect4"            2 "negAffect5 posAffect4 negAffect2 negAffect3"                      
    "negAffect2 negAffect3" 2 "negAffect5 posAffect4 negAffect2 negAffect3"                      
    "negAffect2 negAffect3" 3 "negAffect2 negAffect3 negAffect2 negAffect5 negAffect3 negAffect4"
    "negAffect2 negAffect5" 3 "negAffect2 negAffect3 negAffect2 negAffect5 negAffect3 negAffect4"
    "negAffect3 negAffect4" 3 "negAffect2 negAffect3 negAffect2 negAffect5 negAffect3 negAffect4"
    end

  • #2
    This is a good exercise in understanding how explicit subscripting plus -replace- work in Stata. Here:

    Code:
    . sort tempid
    
    . gen myB = A
    
    . by tempid: replace myB = myB[_n-1]+ " " +myB if _n>1
    variable myB was str43 now str65
    (6 real changes made)
    
    . by tempid: replace myB = myB[_N]
    (6 real changes made)
    
    . assert B==myB
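
The question also asks for a running version (like gen sum). A minimal sketch, assuming the same layout as the data in #1, keeps the running concatenation and the group total in separate variables. The names obsno, running and total are illustrative and do not appear in the thread. obsno is added because sort tempid on its own does not guarantee the order of observations within each tempid.

```stata
* Sketch only: obsno, running and total are hypothetical variable names.
clear
input str54 A float tempid
"negAffect5"            1
"posAffect4"            1
"negAffect2 negAffect3" 1
"negAffect2 negAffect3" 3
"negAffect2 negAffect5" 3
"negAffect3 negAffect4" 3
end

* Record the original order so that sorting within tempid is reproducible.
gen long obsno = _n

* Running concatenation: the string analogue of -gen sum-.
bysort tempid (obsno): gen running = A if _n == 1
by tempid: replace running = running[_n-1] + " " + A if _n > 1

* Group total: the string analogue of -egen sum-, taken from the last running value.
by tempid: gen total = running[_N]

list, sepby(tempid)
```

Because -replace- works through the observations in order, running[_n-1] already holds the accumulated string when observation _n is processed, which is the same mechanism the code above relies on.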



    • #3
      Fred Lee started a thread earlier https://www.statalist.org/forums/for...ariables-by-id

      He cross-referenced this one there, but should have mentioned that one here too.

      A puzzle remains, as what Fred says he wants is precisely the technique explained in the previous thread.

      A twist that may be needed is to remove repetitions:


      Code:
      clear
      set obs 1
      gen whatever = "A B C D E A B C D E"
      gen wc = wordcount(whatever)
      su wc, meanonly
      local max = r(max)
      gen wanted = word(whatever, 1)
      quietly forval j = 2/`max' {
          replace wanted = wanted + " " + word(whatever, `j') if !strpos(wanted, word(whatever, `j'))
      }
      list
      EDIT: Fixed typo.
      Last edited by Nick Cox; 14 Mar 2021, 06:57.



      • #4
        Nick forgot to add forv on line 8 of the code. Here is the corrected code:
        Code:
        clear
        set obs 1
        gen whatever = "A B C D E A B C D E"
        gen wc = wordcount(whatever)
        su wc, meanonly
        local max = r(max)
        gen wanted = word(whatever, 1)
        quietly forv j = 2 / `max' {
            replace wanted = wanted + " " + word(whatever, `j') if !strpos(wanted, word(whatever, `j'))
        }
        list
        Regards
        --------------------------------------------------
        Attaullah Shah, PhD.
        Professor of Finance, Institute of Management Sciences Peshawar, Pakistan



        • #5
          Attaullah Shah is naturally correct; thanks for the signal. (I copied and pasted the code, and decided to add quietly but the edit carelessly replaced rather than inserted.)
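
One caveat on the de-duplication loop in #3 and #4: strpos() looks for a substring, so a word that is contained in a longer word already kept would be skipped. A minimal sketch of a whole-word variant follows, which pads both strings with spaces before testing. The second observation, with the made-up token negAffect22, is a hypothetical example and is not from the thread. The extra condition on word() != "" is an added guard for observations with fewer words than the maximum.

```stata
* Sketch only: the second row is invented to show the substring problem.
clear
input str40 whatever
"A B C D E A B C D E"
"negAffect22 negAffect2 negAffect22"
end

gen wc = wordcount(whatever)
su wc, meanonly
local max = r(max)
gen wanted = word(whatever, 1)

* Append word j only if it exists and is not already present as a whole word.
quietly forval j = 2/`max' {
    replace wanted = wanted + " " + word(whatever, `j') if word(whatever, `j') != "" & !strpos(" " + wanted + " ", " " + word(whatever, `j') + " ")
}
list whatever wanted
```

With the plain strpos(wanted, word(whatever, `j')) test, negAffect2 in the second observation would be dropped because it is a substring of negAffect22. The padded test keeps it.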



          • #6
            Originally posted by Joro Kolev (#2)
            Thanks Joro! This is exactly what I want.



            • #7
              Originally posted by Nick Cox (#3)
              Thanks also for showing how to remove repetitions.
              That goes beyond what I asked for; thanks for taking the extra step!



              • #8
                Originally posted by Attaullah Shah (#4)
                Thanks so much for your correction!
