
  • Repeat Strings in Irregular Pattern in a Column in Stata

    Hello, I have a small dataset like the one below.
    I want to overwrite values according to the following rules:
    1) If two consecutive values are the same except for the "-" sign, the hyphenated value should overwrite the second one; for example, F-000073 overwrites F000073.
    2) If a value has no similar neighbor, just keep it; the rule above does not apply.
    3) If consecutive values are already identical, nothing needs to change. (Of course, we could overwrite anyway, but the result would be the same.)
    Thank you for your help!


    clear
    input str60 number
    F-000073
    F000073
    F-000509
    F000509
    F-000675
    F000675
    F-000703
    F000703
    F-000706
    F-000709
    F000709
    F-000713
    F000713
    F-000744
    F000744
    F-000746
    F000746
    F-000748
    F000748
    F-000750
    F000750
    F-000762
    F000762
    F-000781
    F000781
    F-000782
    F000782
    F-000783
    F000783
    F-000784
    F000784
    F-000788
    F000788
    F-000803
    F000803
    F-000806
    F000806
    F-000811
    F000811
    F-000813
    F000813
    F-000817
    F000817
    F-000831
    F000831
    F-000832
    F000832
    F-000833
    F000833
    F-000834
    F000834
    F-000835
    F000835
    F-000837
    F000837
    F-000838
    F-000838
    F-000SAL
    F-000SAL
    F-002005
    W36GMUF
    W36GMUF
    W36GTOP
    W36GTOP
    D-000180
    end
    Last edited by smith Jason; 24 May 2022, 12:41.

  • #2
    Code:
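    * Sort key: the value without hyphens, so each pair sorts together with the hyphenated form first
    gen sort = ustrregexra(number,"-","")
    sort sort number
    * Insert a hyphen after the leading letter where that reproduces the previous value
    replace number = ustrregexra(number,"^([A-Z])","$1-") if number[_n-1] == ustrregexra(number,"^([A-Z])","$1-")
    drop sort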
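
    Note that the code in #2 sorts the data, so the original row order is not kept (D-000180, for example, moves to the top). If the order matters, a possible alternative is to compare each value only with the one directly before it, which is how the question states the rule. The following is a minimal sketch, not the method from #2. It assumes the hyphenated form always comes first in a pair, as it does in the example data, and it uses a shortened copy of that data so it can be run on its own.

```stata
clear
input str60 number
F-000073
F000073
F-000706
F-000709
F000709
W36GMUF
W36GMUF
D-000180
end

* Overwrite a value with the previous one when the two differ only by hyphens.
* Stata's replace works through observations in order, so number[_n-1] is the
* already-updated previous value. Identical neighbors are skipped (rule 3).
replace number = number[_n-1] in 2/l ///
    if number == subinstr(number[_n-1], "-", "", .) & number != number[_n-1]

list, clean noobs
```

    Because nothing is sorted, unmatched values such as F-000706 and D-000180 stay where they were. If a pair could appear with the unhyphenated form first, this sketch would leave it unchanged, and the sort-based approach in #2 would be the better fit.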



    • #3
      Thank you!
