Hello all
I have panel data with a variable "Net weekly pay" (NETWK). This variable contains negative values because of the way the data were collected: a value of "-9" means that the question does not apply to that individual.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double PERSID byte quarter long NETWK
10101010101  6  -9
10101010101  8  -9
10101010101  9  -9
10102020102  6  99
10102020102  7  -9
10102020102  8  -9
10102020102  9  -9
10102020102 10  44
10104030101  8 346
10104030101  9  -9
10104030101 10  -9
10104030101 11  -9
10104030101 12 350
10104030102  8  -9
10104030102  9  -9
10104030102 10  -9
10104030102 11  -9
10104030102 12  -9
10203030101  7 190
10203030101  8  -9
10203030101  9  -9
10203030101 10  -9
10203030101 11 160
10303030101  7  -9
10303030101  8  -9
10303030101  9  -9
10303030101 10  -9
10303030101 11  -9
10304050102  8 438
10501030101  5 416
10501030101  6  -9
10501030101  7  -9
10501030101  8  -9
10501030101  9 415
10602020101  6  -9
10602020101  7  -9
10602020101  8  -9
10602020101  9  -9
10602020101 10  -9
10603070101  7  -9
10603070101  8  -9
10603070101  9  -9
10603070101 11  -9
10604060102  8 635
10604060102  9  -9
10604060102 10  -9
10604060102 11  -9
10604060102 12  -8
10604070101 11  -9
10604070101 12  -9
10604070103  8 115
10604070103  9  -9
10604070103 10  -9
10604070103 11  -9
10604070103 12  -8
10701010101  5 314
10701010101  6  -9
10701010101  7  -9
10701010101  8  -9
10701010101  9 300
10793010101  4  -9
10793010101  5  -9
10793010101  6  -9
10793010101  7 383
10794010101  4 485
10794010101  5  -9
10794010101  6  -9
10794010101  7  -9
10794010101  8 454
10794010102  4 519
10794010102  5  -9
10794010102  6  -9
10794010102  7  -9
10794010102  8 219
10801020103  5  -9
10801020103  8  -9
10801020103  9 395
10801020104  8  -9
10801020104  9 166
10802020101  6 438
10802020101  7  -9
10802020101  8  -9
10802020101  9  -9
10802020101 10 438
10802020102  6 254
10802020102  7  -9
10802020102  8  -9
10802020102  9  -9
10802020102 10 277
10993020101  4  -9
10993020101  6  -9
10993020102  4  -9
10993020102  5  -9
10993020102  6  -9
10993020102  7 923
11001010101  5  -9
11001010101  6  -9
11001010101  7  -9
11001010101  8  -9
11001010101  9  -9
end
label values quarter quarter
label def quarter 4 "Oct-Des 2019", modify
label def quarter 5 "Jan-Mar 2020", modify
label def quarter 6 "Apr-June 2020", modify
label def quarter 7 "July-Sep 2020", modify
label def quarter 8 "Oct-Des 2020", modify
label def quarter 9 "Jan-Mar 2021", modify
label def quarter 10 "Apr-June 2021", modify
label def quarter 11 "July-Sep 2021", modify
label def quarter 12 "Oct-Des 2021", modify
label values NETWK NETWK5
label def NETWK5 -9 "Does not apply", modify
label def NETWK5 -8 "No answer", modify
My question is:
Should I recode "-9" as a missing value when analysing the data?
I ran the regression both ways: once with "-9" recoded to missing and once with "-9" left in as a value. The results differ substantially, because I have about 70,000 observations in total and the majority of them (about 48,000) have "-9" for Net weekly pay.
What is the best way to deal with this?
Many thanks
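For concreteness, here is a minimal sketch of what the recoding step could look like in Stata. It assumes that -9 and -8 are mapped to the extended missing values .a and .b (an arbitrary choice of codes), and the names y and x1 are hypothetical placeholders that do not appear in the example data.

```stata
* Sketch only: turn the special codes into extended missing values
* (.a and .b are an arbitrary choice of codes)
mvdecode NETWK, mv(-9=.a \ -8=.b)

* Keep the meaning of the codes attached to the new missing values
label define NETWK5 .a "Does not apply" .b "No answer", add

* Check how many observations fall into each missing category
tabulate NETWK if missing(NETWK), missing

* Hypothetical model: y and x1 are placeholder names, not in the example data
* Observations with missing NETWK are dropped from the estimation sample
regress y NETWK x1
```

Using separate extended missing codes keeps "Does not apply" distinguishable from "No answer" after the recode.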