  • Changing the value of many observations

    Hello

    So, I'm working on a fairly large dataset for a Norwegian insurance company, and I want to change the values of one variable, or rather the value Stata "reads" from it. Initially the variable was a string: it showed "Expired" for an insurance offer that was no longer valid and "rejected" for an offer rejected by the company, while "accepted" was simply a missing value.
    I started off using "decode", which made "expired" = 1, "rejected" = 2 and "accepted" = "." (just a period). I do not really need the "rejected" observations, so I essentially want to make "expired" = 0 and "accepted" = 1. The variable name is "avsluttet_aarsak" (which loosely translates to "reason for termination/closure").
    I used the code:

    sort avsluttet_aarsak
    replace avsluttet_aarsak = 0 in 1/762621 --> which worked for the "expired" observations, but when I wrote:
    replace avsluttet_aarsak = 1 in 76262/1500403 --> then all the "accepted" observations became "rejected".

    I assume this is because the underlying value that the "rejected" observations have is 1, due to the initial decoding, and this is why that happens.

    Does anyone know of an easier and better way of doing this? I'm quite new to Stata, so any help/input would be much appreciated.
    Last edited by Tor Haug Anonsen; 30 Oct 2021, 03:43.

  • #2
    I think you are encoding instead of decoding.

    Code:
    encode initial_var, generate(new_var)    // encode initial string variable to numeric variable
    drop if new_var == 2    // delete "rejected" obs
    recode new_var (1 = 0) (missing = 1)    // make "expired" = 0 and "accepted" = 1
    label define new_lab 1 "accepted" 0 "expired"    // redefine value labels
    label values new_var new_lab    // assign new labels to new_var
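
    As a possible alternative to the encode-then-recode route above, the 0/1 indicator could also be built straight from the string variable. This is only a sketch on made-up toy data: the names accepted and accepted_lab are hypothetical, and the exact spellings "Expired" and "rejected" are assumed from the description in #1 and should be checked against the real data.

    ```stata
    * Toy data standing in for the real dataset (values made up for illustration)
    clear
    input str10 initial_var
    "Expired"
    "rejected"
    ""
    ""
    "Expired"
    end

    * 1 if the string is empty ("accepted"), 0 if it is "Expired";
    * the if qualifier leaves "rejected" observations as missing
    generate byte accepted = (initial_var == "") if initial_var != "rejected"

    * Attach value labels so tables show text rather than 0/1
    label define accepted_lab 0 "expired" 1 "accepted"
    label values accepted accepted_lab

    * Inspect the result, including the missing category
    tabulate accepted, missing
    ```

    Because the comparison is made on the string itself, this version does not depend on which integer codes encode happens to assign.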


    • #3
      Originally posted by Fei Wang
      I think you are encoding instead of decoding.

      Code:
      encode initial_var, generate(new_var) // encode initial string variable to numeric variable
      drop if new_var == 2 // delete "rejected" obs
      recode new_var (1 = 0) (missing = 1) // make "expired" = 0 and "accepted" = 1
      label define new_lab 1 "accepted" 0 "expired" // redefine value labels
      label values new_var new_lab // assign new labels to new_var

      Yes, of course. So should I go back to the dataset from when the variable was still a string and encode it then? Or is there a way of turning it back into a string and then running the code that you presented?
      Anyway, thank you so much, I will try this right away.


      • #4
        Originally posted by Tor Haug Anonsen

        Yes, of course. So should I go back to the dataset from when the variable was still a string and encode it then? Or is there a way of turning it back into a string and then running the code that you presented?
        Anyway, thank you so much, I will try this right away.

        If your variable has already been encoded, then run the code from the second line onwards ("new_var" is just an example name; your variable is named "avsluttet_aarsak"). BTW, you may need to double-check whether "rejected" is indeed equal to 2.
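
        One way to do that double-check is sketched below on made-up toy data. The variable status is hypothetical, and the value label is assumed to carry the same name as the encoded variable, which is what encode does when no label() option is given.

        ```stata
        * Toy data standing in for the real dataset (values made up for illustration)
        clear
        input str10 status
        "Expired"
        "rejected"
        ""
        "Expired"
        ""
        end

        * encode assigns integer codes in alphabetical order of the strings;
        * empty strings become numeric missing
        encode status, generate(avsluttet_aarsak)

        * Show which integer code belongs to which label
        label list avsluttet_aarsak

        * Tabulate the underlying numbers instead of the labels
        tabulate avsluttet_aarsak, nolabel missing

        * Drop by label text rather than by a hard-coded number
        drop if avsluttet_aarsak == "rejected":avsluttet_aarsak
        ```

        Referring to the label text in the drop condition avoids relying on "rejected" being code 2.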


        • #5
          Originally posted by Fei Wang

          If your variable has already been encoded, then run the code from the second line onwards ("new_var" is just an example name; your variable is named "avsluttet_aarsak"). BTW, you may need to double-check whether "rejected" is indeed equal to 2.

          Awesome, this worked perfectly. After a lot of attempts, I see that I was missing the last bit of code, which assigns the labels to the variable.
          Again, thank you so much!!
