  • Making new variable from the blanks of an existing variable

    Hello! First time poster here. I am trying to make a new variable out of the blanks of another variable. I want a variable called "solo" that is 1 for all the observations that don't have a value for "medicalgroup", a string variable. And "medicalgroup" doesn't have any missing variables, just embedded blanks. I tried the following:

    replace solo=1 if medical group == " "

    and it didn't make any changes in the "solo" column.

    Any advice would be well appreciated! Thank you.

    Khat

  • #2
    I want a variable called "solo" that is 1 for all the observations that don't have a value for "medicalgroup", a string variable. And "medicalgroup" doesn't have any missing variables, just embedded blanks.
    Sorry, I don't understand this.

    Rather than explaining, it would be better if you showed an example of your data that covers the various cases, and then indicated which observations should have solo = 1, and which should have it as 0. Be sure to use the -dataex- command to show your example. READ FAQ #12 if you do not already have and know how to use the -dataex- command.



    • #3
      Khat:
      welcome to the list.
      I share Clyde's puzzlement about your query.
      First off, you might have seen no change because -medical group- should have been typed as a single name such as -medical_group-; otherwise Stata treats -medical- and -group- as separate variables, which makes the syntax illegal (but unless you post what Stata gave you back, as per Clyde's wise advice, how can interested listers help you out on that?).
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Like others, I have no precise idea what is going on without a data example, but I offer the remark that empty "" and space " " are quite different to Stata, even though to a user empty strings and one or more spaces usually look equivalent. You can capture both with

        Code:
        if trim(group) == ""
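
        A minimal sketch of the distinction, assuming a made-up three-observation dataset (the variable name code5 is borrowed from later in the thread; the values are invented for illustration and are not the poster's data):

        ```stata
        * Toy data: one empty string, one single space, one real value
        clear
        set obs 3
        generate str3 code5 = ""
        replace code5 = " " in 2
        replace code5 = "BAZ" in 3

        * Matches only the observation holding a single space
        count if code5 == " "

        * Matches only the observation holding the empty string
        count if code5 == ""

        * trim() strips leading and trailing blanks first,
        * so this matches both the empty and the single-space observation
        count if trim(code5) == ""
        ```

        Comparing against trim() is the safer default when it is not known whether blank-looking values are empty or contain spaces.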



        • #5
          My guess is that where the poster writes

          And "medicalgroup" doesn't have any missing variables, just embedded blanks.
          the expectation is that a missing value would present as a period, which is not the case for string variables: the empty string is probably what is in the data.

          Nick's robust code should solve the problem in any case.
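
          To illustrate the point about how missing values present, here is a small sketch with invented values (the variable names s and x are hypothetical): a string variable is missing when it holds the empty string, never a period, and missing() reflects that.

          ```stata
          * Toy data: a string variable and a numeric variable
          clear
          set obs 2
          generate str1 s = ""
          replace s = " " in 2
          generate double x = .

          * For strings, missing() is true only for the empty string,
          * so this lists observation 1 but not the single-space observation 2
          list if missing(s)

          * For numerics, missing() is true for . (and .a to .z),
          * so this lists both observations
          list if missing(x)
          ```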



          • #6
            Hello,

            Thank you for the suggestions and for telling me about the -dataex- command.
            I'm trying to generate a variable, "solo" that says 1 wherever "medicalgroup" and "code5" are blank.
            Here is some of my code and data (using other variables now, in addition to "medicalgroup"), I hope the formatting isn't too noisy--

            . gen solo=.
            (118,697 missing values generated)

            . replace solo=1 if size == 1
            (16,857 real changes made)

            . replace solo=1 if medicalgroup==" "
            (0 real changes made)

            . replace solo=1 if code5==" "
            (0 real changes made)

            . replace solo=1 if code5== " "
            (0 real changes made)

            . replace solo=1 if code5==" "
            (0 real changes made)


            [CODE]
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float solo str3(medicalgroup code5) int size
            1 "" "" 4
            . "" "BAZ" 2
            . "" "FAN" 4
            . "" "NEM" 14
            . "" "UAB" 17
            . "" "MAL" 8
            . "" "" 2
            . "" "BTD" 18
            . "" "KAX" 2
            . "" "" 2
            . "" "" 14
            . "" "BSN" 28
            1 "" "FKN" 1
            . "" "CAH" 5
            . "" "" 2
            1 "" "" 1
            . "" "" 2
            1 "" "" 1
            . "" "BEF" 7
            . "" "" 2
            . "" "" 9
            . "" "" 4
            1 "" "" 1
            . "" "" 2
            . "" "" 9
            . "" "BET" 10
            . "" "SSD" 18
            . "" "UAB" 19
            . "" "" 6
            . "" "" 3
            . "" "BWH" 58
            . "" "" 7
            . "" "LDH" 2
            . "" "CHL" 48
            . "" "" 7
            . "" "" 7
            . "" "" 23
            1 "" "" 1
            1 "" "BTD" 1
            . "" "" 39
            . "" "CHL" 18
            1 "" "" 1
            1 "" "" 1
            1 "" "" 1
            1 "" "" 1
            . "" "" 30
            . "" "" 18
            1 "" "" 1
            . "" "UAB" 22
            . "" "BET" 11
            . "" "NEM" 6
            . "" "NEM" 13
            . "" "" 5
            . "" "BET" 8
            . "" "" 18
            . "" "" 13
            . "" "" 18
            . "" "" 5
            . "" "" 2
            . "" "WC6" 7
            . "" "" 3
            . "" "LDA" 18
            1 "" "" 1
            1 "" "" 1



            • #7
              Hello. In your data example, you do not need that many observations, but perhaps you could add spaces between the variables so that it is easier to read, and add the closing [/CODE] tag at the end.



              • #8
                Nick's advice in post #4 is totally correct. Everywhere that you have
                Code:
                if code5==" "
                you should instead have
                Code:
                if code5==""
                or
                Code:
                if trim(code5)==""
                Note that in the corrected versions the two quotation marks have no intervening blank character as they do in the original version. As Nick points out, there is a difference between an empty string variable, and a string variable containing one blank character - and for that matter, they both differ from a string variable containing two blank characters.

                Looking at your dataex output makes it clear that in your data, the strings in question are empty and have no blank character.
                Code:
                1 "" "" 4
                . "" "BAZ" 2
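
                Putting this together, one possible sketch of the whole task follows. It uses four rows taken from the posted example and assumes that solo should be 1 when size is 1, or when both medicalgroup and code5 are blank; that reading of "and" is an assumption, since the original commands tested each variable separately.

                ```stata
                * A few rows in the same layout as the posted -dataex- example
                clear
                input str3(medicalgroup code5) int size
                "" ""    4
                "" "BAZ" 2
                "" "FKN" 1
                "" ""    1
                end

                * Start with solo = 1 where size is 1, missing elsewhere
                generate byte solo = 1 if size == 1

                * Assumption: also flag observations where BOTH strings are blank.
                * trim() makes the test robust to values made up only of spaces.
                replace solo = 1 if trim(medicalgroup) == "" & trim(code5) == ""

                list
                ```

                If either variable being blank should be enough, replace & with | in the last condition.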



                • #9
                  I tried that without the space, and it worked. Thank you so much!

