  • I need help in finding the correct command, help!

    Hello,
    I want to replace phrases in a string variable with numbers so that I can convert it to a numeric variable.
    I used the following command to replace:
    replace VARIABLE = "NUMBER" if VARIABLE == "TEXT"
    However, I have a long list of texts, so I want the "if" condition to hold whenever the cell contains a specific word or starts with a specific word.

  • #2
    There are several radically different situations that are all compatible with your description, and they would require radically different solutions.

    Please post back showing an example of your data set, using the -dataex- command to do so. Also show what you want the final result to look like, because from your description it is not at all clear what the relationship between the numerical values and the text values is supposed to be and there are many possibilities.

    If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    • #3
      Thank you.

      I will try to make my question clearer because the command didn't work:

      I have a string variable with text in the cells, for example: Car 1997, Car 2010, Car 2023, Bicycle 1997, Bicycle 2010, Bicycle 2023, and so on.

      I want to replace any cell that contains or starts with the word "Car" with "Car" only, and any cell with the word "Bicycle" with "Bicycle" only, regardless of the year attached to it.

      Lastly, I want to replace these texts with numbers 0, 1, 2... and then destring the variable.

      Your help is much appreciated.

      • #4
        This may help. I do things two ways.

        Code:
        clear 
        input str12 whatever
        "Car 1997"
        "Car 2010"
        "Car 2023"
        "Bicycle 1997"
        "Bicycle 2010"
        "Bicycle 2023"
        "Walk 1952"
        end 
        
        gen wanted = cond(word(whatever, 1) == "Bicycle", 1, cond(word(whatever, 1) == "Car", 2, 0)) 
        
        gen WANTED = 0 
        replace WANTED = 1 if word(whatever, 1) == "Bicycle"
        replace WANTED = 2 if word(whatever, 1) == "Car"
        
        list, sepby(wanted)
        
             +--------------------------------+
             |     whatever   wanted   WANTED |
             |--------------------------------|
          1. |     Car 1997        2        2 |
          2. |     Car 2010        2        2 |
          3. |     Car 2023        2        2 |
             |--------------------------------|
          4. | Bicycle 1997        1        1 |
          5. | Bicycle 2010        1        1 |
          6. | Bicycle 2023        1        1 |
             |--------------------------------|
          7. |    Walk 1952        0        0 |
             +--------------------------------+
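
        If the keyword can appear anywhere in the cell rather than as the first word, a test on strpos() may be closer to what was asked. This is only a sketch: the "Used Car 2010" row and the variable names contains_car, starts_car and code are made up for illustration. Note that strpos() is case-sensitive and matches substrings, so "Carpet" would also count as containing "Car".

        ```stata
        clear
        input str20 whatever
        "Car 1997"
        "Used Car 2010"
        "Bicycle 2023"
        "Walk 1952"
        end

        * strpos() returns 0 when the text is not found, so > 0 means "contains"
        gen byte contains_car = strpos(whatever, "Car") > 0

        * comparing the leading characters tests "starts with"
        gen byte starts_car = substr(whatever, 1, 3) == "Car"

        * same coding as above, but matching the keyword anywhere in the cell
        gen byte code = 0
        replace code = 1 if strpos(whatever, "Bicycle") > 0
        replace code = 2 if strpos(whatever, "Car") > 0

        list
        ```

        Whether "contains" or "starts with" is the right test depends on what the real data look like, which is why a data example helps.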

        Please read https://www.statalist.org/forums/help#stata (please give a data example; "didn't work" is not helpful as a problem report)

        https://www.statalist.org/forums/help#topiclines (please use informative titles)
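
        With a long list of categories, one replace per word gets tedious. A possible shortcut, assuming the category is always the first word of the cell, is to extract that word and let egen number the distinct values. The variable names first and code are my own choices here. The result is already numeric, so no destring is needed.

        ```stata
        clear
        input str12 whatever
        "Car 1997"
        "Car 2010"
        "Car 2023"
        "Bicycle 1997"
        "Bicycle 2010"
        "Bicycle 2023"
        "Walk 1952"
        end

        * keep only the first word, dropping the year
        gen first = word(whatever, 1)

        * group() numbers the distinct values 1, 2, 3, ... in sorted order
        egen code = group(first)

        * shift so that the codes start at 0
        replace code = code - 1

        tabulate first code
        ```

        The codes follow the sort order of the words, so here Bicycle is 0, Car is 1 and Walk is 2. If a particular mapping is needed, the explicit replace statements remain the safer route.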
