  • Recoding into another variable

    I am at a total loss for what seems like a straightforward issue. I would be grateful for any help:

    I have a string variable "Other...please specify." Several respondents provided text responses to this "Other-specify" variable (e.g., "farming"). I want to move the values of the "Other-specify" variable to existing categories of another variable "Industry," which is numeric (e.g., 1- Agriculture). I have tried this:

    replace Industry = 1 if Otherspecify_TEXT=="farming"

    but got the message "(0 real changes made)."

    Any advice would be much appreciated!
    Last edited by Sarah Soroui; 10 Feb 2019, 01:19.

  • #2
    Sarah:
    welcome to this forum.
    In all likelihood, you have stray blanks in your -string- variables.
    In the following toy example, a -string- variable with a trailing blank is created and, as expected, the first attempt to create a numeric variable, which assumes that the -string- variable has no trailing blanks, is unsuccessful:
    Code:
    . set obs 1
    number of observations (_N) was 0, now 1
    
    .
    . g Otherspecify_TEXT="farming "
    
    .
    . g Industry = 1 if Otherspecify_TEXT=="farming"
    (1 missing value generated)
    Deleting the trailing blank allows accomplishing what I'm after:
    Code:
    . replace Otherspecify_TEXT=strrtrim(Otherspecify_TEXT)
    (1 real change made)
    
    .
    . replace Industry = 1 if Otherspecify_TEXT=="farming"
    (1 real change made)
    
    . lis
    
         +---------------------+
         | Others~T   Industry |
         |---------------------|
      1. |  farming          1 |
         +---------------------+
    If you have leading (trailing) blanks in your -string- variable, see -help strltrim- (-help strrtrim-).
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Hello Carlo, Thank you very much for your response. I tried the codes you provided but still got the error message (0 real changes made).
      I think I might have mucked this up by presenting an example of the problem I am dealing with versus the actual issue (maybe that's why it's not working)?

      This is the table of the "Otherspecify_TEXT" (which is called Q33_20_TEXT) variable I am using


      tab Q33_20_TEXT :
      [Image: pic1.png]

      I want to move these values to the Other category in the Industry variable (20-Other)

      I provided the labelbook code on the Industry variable in case it's helpful:


      [Image: image_13409.png]

      I tried applying the codes you provided, and got this response:

      replace Q33_20_TEXT =strrtrim(Q33_20_TEXT)
      (171 real changes made)

      replace Industry = 20 if Q33_20_TEXT=="non-profit"
      (0 real changes made)

      As an aside, I tried generating the labelbook for Q33_20_TEXT and got this error:
      value label Q33_20_TEXT not found
      r(111);

      Could that be the problem?

      Thank you so much for your help. I have been spinning my wheels on this for hours now.

      Many thanks,
      Sarah
      Last edited by Sarah Soroui; 10 Feb 2019, 09:36.



      • #4
        You might try replacing
        Code:
        strrtrim(Q33_20_TEXT)
        which trims blanks from the right end of the variable with
        Code:
        strtrim(Q33_20_TEXT)
        which trims blanks from both ends of the variable; perhaps there are leading blanks rather than (or in addition to) trailing blanks.
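To see the difference between the trimming functions on toy data, here is a minimal sketch. The variable names raw, r_only, l_only and both are made up for illustration; only strltrim(), strrtrim() and strtrim() come from this thread.

```stata
clear
* Three toy values: a leading blank, a trailing blank, and both
input str12 raw
" farming"
"farming "
" farming "
end

generate r_only = strrtrim(raw)   // removes trailing blanks only
generate l_only = strltrim(raw)   // removes leading blanks only
generate both   = strtrim(raw)    // removes leading and trailing blanks

count if r_only == "farming"      // 1 of 3 matches
count if l_only == "farming"      // 1 of 3 matches
count if both   == "farming"      // all 3 match
```

If strtrim() reports no changes and the comparison still fails, blanks at the ends of the string are probably not the cause.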



        • #5
          Hi, William. Thank you very much for your response.

          So, unfortunately, this isn't working either.

          I tried:

          replace Q33_20_TEXT =strtrim(Q33_20_TEXT)
          (0 real changes made)


          . replace Industry = 20 if Q33_20_TEXT=="non-profit"
          (0 real changes made)

          is there any additional information that I can provide that might help resolve this?



          • #6
            Perhaps some example data will help. Try running
            Code:
            dataex Q33_20_TEXT if Q33_20_TEXT != ""
            and copy the output of the command from Stata's Results window and paste it into a new post. Be sure to copy everything starting with [CODE] and ending with [/CODE].

            Also, run
            Code:
            codebook Q33_20_TEXT
            and copy the output from Stata's results window and paste it into the new post.

            Connected with this suggestion is some advice for making effective use of Statalist. Take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ.





            • #7
              Hi William, thank you very much for helping me through this and for directing me to the FAQ sheet. Here is what I have:

              Code:
              .  ssc install dataex
              checking dataex consistency and verifying not already installed...
              installing into c:\ado\plus\...
              installation complete.
              Code:
              . dataex Q33_20_TEXT if Q33_20_TEXT != ""

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input strL Q33_20_TEXT
              "non-profit"                          
              "Philanthropic Organization/Foundation"
              "Environment and Conservation"        
              "land conservation trust"              
              end
              Code:
              . codebook Q33_20_TEXT
              
              ------------------------------------------------------------------------------------------------------
              Q33_20_TEXT           Using the following terms for industry groups, which one best describes your com
              ------------------------------------------------------------------------------------------------------
              
                                type:  string (strL)
              
                       unique values:  4                        missing "":  171/175
              
                          tabulation:  Freq.  Value
                                         171  ""
                                           1  "Environment and Conservation"
                                           1  "Philanthropic Organization/Foundation"
                                           1  "land conservation trust"
                                           1  "non-profit"
              
                             warning:  variable has embedded blanks
              
              .

              Last edited by Sarah Soroui; 10 Feb 2019, 11:53.



              • #8
                Wow, I am totally at a loss, because when I read in the sample data you provided and apply your replace command, it works as expected, as shown below. Perhaps someone else will see what you and I are missing.

                Now that you're familiar with copying and pasting Stata commands and output (thank you!), let me suggest you run the following and then copy and paste the commands and output into a CODE block in a new post. That will confirm to the skeptical that you did not leave out anything in your earlier posts.
                Code:
                codebook Q33_20_TEXT
                replace Industry = 20 if Q33_20_TEXT=="non-profit"
                And here are my results.
                Code:
                . clear
                
                . input strL Q33_20_TEXT
                
                     Q33_20_~T
                  1. "non-profit"                          
                  2. "Philanthropic Organization/Foundation"
                  3. "Environment and Conservation"        
                  4. "land conservation trust"              
                  5. end
                
                . generate Industry = .
                (4 missing values generated)
                
                . replace Industry = 20 if Q33_20_TEXT=="non-profit"
                (1 real change made)
                
                . list, clean
                
                                                 Q33_20_TEXT   Industry  
                  1.                              non-profit         20  
                  2.   Philanthropic Organization/Foundation          .  
                  3.            Environment and Conservation          .  
                  4.                 land conservation trust          .  
                
                .



                • #9
                  Hi, William. Thank you so much for your help. I will do as you suggested and re-post to the forum with the Code you recommended in the proper format.

                  However, I am starting to wonder if Replace is even the right code?
                  I want to double check this with you in case maybe the problem lies in the way I am trying to explain my objective? I would like to do this by sharing an example that I found in a book (The Practice of Survey Research, Theory, and Applications, Ruel, Wagner, Gillespie, 2015).

                  Here, they provide the example of the Race variable:

                  "Q5 Would you say you are… (race) 1. White 2. Black 3. Asian 4. Hispanic 5. Other (please specify __________________)"

                  If someone responded "Other" and typed "Caucasian," they suggest re-coding this to the existing category ("1-White") in the Race variable.

                  This is what I am trying to achieve except rather than recode into an existing category of the variable "Q33_20_TEXT", I want to recode "non-profit" (and eventually the other responses) into a category of the variable Industry (in the case of "non-profit", I'd like to move the response to the category "20-Other" of the Industry variable).

                  I just want to confirm: Is "Replace" ultimately the right command to achieve this? I just want to make sure I have isolated the problem. I really appreciate your time and help.
                  Last edited by Sarah Soroui; 10 Feb 2019, 12:45.



                  • #10
                    Your question has led me to a possible explanation for your problem.

                    In the example from the text that you cite, there would have been two variables, which I will name race (coded 1-5) and race_other_text, which would have been blank when race is coded 1-4 and when race is coded 5 ("Other") would have had entries like "Caucasian". The text recommends you do something like
                    Code:
                    replace race=1 if race_other_text=="Caucasian"
                    Race is a fraught term, but suppose the respondent had said "Oceanian" to mean descended from natives of islands in the South Pacific. (I believe I saw this in a survey categorization once.) Then you might not want to reclassify them to any of the categories 1-4, but leave them in category 5.

                    I think that's what's happening in your work.

                    Run the following code:
                    Code:
                    list Industry Q33_20_TEXT if Q33_20_TEXT != "" | Industry==20
                    Note that the character before the second "Industry" is the vertical bar (|), which means "or" to Stata and is typically above the backslash (\) on US English keyboards. I think you will see the same four observations, and that the Industry code is 20 for each of them. Industry code 20 doesn't just mean "other industry"; it actually means "respondent provided some other answer, which is recorded in Q33_20_TEXT".

                    So for your particular example with "non-profit", you are trying to replace 20 with 20, and Stata correctly tells you it didn't change anything. We mistakenly thought that was because nothing matched!
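A minimal sketch of how to tell "nothing matched" apart from "matched, but the value was already there", using toy data. The variable names other_text and industry are stand-ins for Q33_20_TEXT and Industry, and the values are invented for illustration.

```stata
clear
* Toy data: the "other" response is already coded 20
input str20 other_text byte industry
"non-profit" 20
""            3
end

* count shows how many observations satisfy the condition
count if other_text == "non-profit"                  // 1 observation matches

* replace only reports values that actually change
replace industry = 20 if other_text == "non-profit"  // (0 real changes made)
```

Running count with the same if condition before a replace is a quick way to check whether the condition itself is the problem.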

                    If none of the four answers in Q33_20_TEXT are to be moved into one of the categories 1-19, then you don't need to do any recoding, and 20 is what it is: other industries not included in the first 19 categories.
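If some "other" responses do belong in one of the existing categories, as with the "farming" example from the first post, a sketch along these lines may help. The variable names other_text, industry and clean_text, the toy values, and the extra spelling "farmer" are all assumptions for illustration; code 1 for Agriculture is taken from the first post.

```stata
clear
* Toy data: two "other" responses coded 20, one ordinary response coded 3
input str30 other_text byte industry
"Farming "   20
"non-profit" 20
""            3
end

* Normalise case and surrounding blanks in a helper variable
generate clean_text = strlower(strtrim(other_text))

* Move only the responses that belong in an existing category (1 = Agriculture)
replace industry = 1 if industry == 20 & inlist(clean_text, "farming", "farmer")

list, clean
```

Responses such as "non-profit" that fit none of the first 19 categories simply stay at 20.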
                    Last edited by William Lisowski; 10 Feb 2019, 13:58.



                    • #11
                      Oh wow! So, Q33_20_TEXT responses are already reflected in Industry variable 20-Other. What a massive oversight on my part. My sincere apologies for my confusion. I recognize now that I have been trying to replace something already reflected in the Industry variable 20-Other. William, thank you so much. I very much appreciate your help and very thorough explanations. Thank you!!!!
