Changing the value of many observations

Tor Haug Anonsen

Join Date: Oct 2021

Posts: 44
#1

30 Oct 2021, 03:37

Hello

I'm working on a fairly large dataset for a Norwegian insurance company, and I want to change how one variable is coded, or rather the values Stata "reads" from it. Initially the variable was a string: it showed "Expired" for an insurance offer that was no longer valid and "rejected" for an offer rejected by the company, while accepted offers were simply missing values.
I started off using "decode", which made "expired" = 1, "rejected" = 2 and "accepted" = "." (just a period). I do not really need the "rejected" observations, so I essentially want "expired" = 0 and "accepted" = 1. The variable name is "avsluttet_aarsak" (which loosely translates to "reason for termination/closure").
I used the code:

sort avsluttet_aarsak
replace avsluttet_aarsak = 0 in 1/762621    // this worked for the "expired" observations, but when I wrote:
replace avsluttet_aarsak = 1 in 76262/1500403    // all the "accepted" observations became "rejected"

I assume this is because the underlying value that the "rejected" observations have is 1, due to the initial decoding, and this is why that happens.

Does anyone know of an easier and better way of doing this? I'm quite new to Stata, so any help or input would be much appreciated.

Last edited by Tor Haug Anonsen; 30 Oct 2021, 03:43.
Tags: changing values

Fei Wang

Join Date: Oct 2021
Posts: 726

30 Oct 2021, 04:03

I think you are encoding instead of decoding.

Code:

encode initial_var, generate(new_var)    // encode initial string variable to numeric variable
drop if new_var == 2    // delete "rejected" obs
recode new_var (1 = 0) (missing = 1)    // make "expired" = 0 and "accepted" = 1
label define new_lab 1 "accepted" 0 "expired"    // redefine value labels
label values new_var new_lab    // assign new labels to new_var
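
To see what these steps do, here is a small sketch on made-up data. The variable name status, its five values, and the label name outcome are hypothetical and only mimic the situation described in #1; I am also assuming the accepted offers are stored as empty strings.

```stata
clear
input str8 status
"Expired"
"rejected"
""
"Expired"
""
end

* encode numbers the distinct strings in sort order and stores the mapping
* in a value label named after the new variable; empty strings become missing
encode status, generate(status_num)
label list status_num

drop if status_num == 2                    // delete "rejected" obs
recode status_num (1 = 0) (missing = 1)    // "expired" -> 0, "accepted" -> 1

* The old label still maps 1 to "Expired" at this point, so the recoded
* "accepted" rows would display under the wrong text until new labels are attached
label define outcome 0 "expired" 1 "accepted"
label values status_num outcome
tabulate status_num, missing
```

Running tabulate status_num, nolabel afterwards shows the numeric codes underneath the labels.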


Tor Haug Anonsen

Join Date: Oct 2021

Posts: 44
#3

30 Oct 2021, 04:15

Originally posted by Fei Wang View Post

I think you are encoding instead of decoding.

Code:

encode initial_var, generate(new_var)    // encode initial string variable to numeric variable
drop if new_var == 2    // delete "rejected" obs
recode new_var (1 = 0) (missing = 1)    // make "expired" = 0 and "accepted" = 1
label define new_lab 1 "accepted" 0 "expired"    // redefine value labels
label values new_var new_lab    // assign new labels to new_var

Yes, of course. So should I go back to the dataset in which the variable was still a string and encode it? Or is there a way to turn it back into a string and then run the code you presented?
Anyway, thank you so much; I will try this right away.
Fei Wang

Join Date: Oct 2021

Posts: 726
#4

30 Oct 2021, 04:19

Originally posted by Tor Haug Anonsen View Post

Yes, of course. So should I go back to the dataset in which the variable was still a string and encode it? Or is there a way to turn it back into a string and then run the code you presented?
Anyway, thank you so much; I will try this right away.

If your variable has already been encoded, run the code from the second line onward ("new_var" is just an example name; your variable is named "avsluttet_aarsak"). BTW, you may need to double-check whether "rejected" is indeed equal to 2.
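
One way to do that check, as a sketch that assumes the variable is still in its freshly encoded state (these commands only display information and do not change the data):

```stata
describe avsluttet_aarsak                      // the "value label" column names the attached label
label list                                     // prints every value label with its code-to-text mapping
tabulate avsluttet_aarsak, missing nolabel     // counts by underlying numeric code, missing included
```

If the listing shows a different code for "rejected", use that number in the drop if line instead of 2.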
Tor Haug Anonsen

Join Date: Oct 2021

Posts: 44
#5

30 Oct 2021, 04:53

Originally posted by Fei Wang View Post

If your variable has already been encoded, run the code from the second line onward ("new_var" is just an example name; your variable is named "avsluttet_aarsak"). BTW, you may need to double-check whether "rejected" is indeed equal to 2.

Awesome, this worked perfectly. After a lot of attempts I see that I was missing the last bit of code that would assign the labels to the variable.
Again, thank you so much!!