I am using European Social Survey data (wave 9) and working on the variable ‘wkhtot’ (with 41,630 valid observations) measuring average weekly working hours. The variable includes 419 observations with weekly working hours greater than 80.
I want to create a copy of the original variable in which all values above 80 are replaced with 80, while missing values stay missing. To build the new variable, I ran the following commands:
gen wkhtot_mod = wkhtot
replace wkhtot_mod = 80 if (wkhtot > 80 & wkhtot != . )
As a result, Stata replaced all extreme values AND the missing values with 80. Why is that?
I did realise that I can reach the intended result with the following:
replace wkhtot_mod = 80 if (wkhtot > 80 & wkhtot < . )
I would still like to understand why the != (not equal) operator did not work in my first attempt.
I would appreciate any insight on this.
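One possible explanation, offered as an assumption because the post does not show how missing values are coded in this file: if wkhtot stores nonresponse as extended missing values (.a, .b, ...) rather than the system missing value (.), then `wkhtot != .` is true for those observations. Stata also treats every missing value as larger than any number, so `wkhtot > 80` is true for them as well. The toy data below are made up for illustration and are not taken from the ESS file; they are only a sketch of how the two conditions differ.

```stata
* Hypothetical toy data: ordinary value, extreme value,
* system missing (.), and an extended missing value (.a)
clear
input wkhtot
40
90
.
.a
end

* != . excludes only the system missing value; .a satisfies both conditions
gen mod_ne = wkhtot
replace mod_ne = 80 if wkhtot > 80 & wkhtot != .

* < . excludes all missing values, because . < .a < ... < .z
gen mod_lt = wkhtot
replace mod_lt = 80 if wkhtot > 80 & wkhtot < .

* !missing() is an equivalent guard that some find easier to read
gen mod_fn = wkhtot
replace mod_fn = 80 if wkhtot > 80 & !missing(wkhtot)

list
```

Whether this applies here can be checked with `tabulate wkhtot if missing(wkhtot), missing`, which lists the missing codes that actually occur in the variable.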