Hi guys,
I have a big problem with missing values in my data set.
Briefly: the data are cross-sectional, and the main reason for the missing values is the irregular interview cycle.
First I dropped all variables with more than 50 percent missing values. After that I dropped every observation that still had a missing value on any remaining variable. In the end I was left with 120 observations in total.
What do you think: is this an acceptable way to proceed? I could not find any rule of thumb for dropping variables with more than 50 percent missing values.
As the table of missing counts shows, I have a lot of variables. For this reason I am going to run a factor analysis in the next step, but in my opinion I need to take care of the missing values first.
Looking forward to hearing from you! Many thanks in advance.
Vera
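In case it helps to see the two steps spelled out, here is a minimal sketch in Python. It is not the code I actually ran: the toy data, the variable names (id, q1, q2, q3) and the helper functions are made up for illustration, and None stands in for a missing answer.

```python
# Hypothetical toy data: one dict per respondent, None marks a missing value.
rows = [
    {"id": 1, "q1": 3, "q2": None, "q3": None},
    {"id": 2, "q1": 4, "q2": 2, "q3": None},
    {"id": 3, "q1": None, "q2": 5, "q3": None},
    {"id": 4, "q1": 1, "q2": 1, "q3": 2},
]


def missing_share(rows, var):
    """Fraction of observations in which `var` is missing."""
    return sum(r[var] is None for r in rows) / len(rows)


def drop_sparse_variables(rows, threshold=0.5):
    """Step 1: drop every variable whose missing share exceeds the threshold."""
    keep = [v for v in rows[0] if missing_share(rows, v) <= threshold]
    return [{v: r[v] for v in keep} for r in rows]


def drop_incomplete_rows(rows):
    """Step 2: listwise deletion, keep only fully observed respondents."""
    return [r for r in rows if all(x is not None for x in r.values())]


step1 = drop_sparse_variables(rows)   # q3 is 75% missing, so it is dropped
step2 = drop_incomplete_rows(step1)   # respondents 1 and 3 still have a gap

print(list(step1[0].keys()))
print([r["id"] for r in step2])
```

Even in this tiny example the second step throws away half of the respondents, which is the same effect that leaves me with only 120 observations.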
Missing | Total |
436 | 4,357 |
436 | 4,357 |
197 | 4,357 |
197 | 4,357 |
2,327 | 4,357 |
0 | 4,357 |
58 | 4,357 |
441 | 4,357 |
0 | 4,357 |
585 | 4,357 |
2,100 | 4,357 |
0 | 4,357 |
10 | 4,357 |
189 | 4,357 |
198 | 4,357 |
228 | 4,357 |
226 | 4,357 |
225 | 4,357 |
229 | 4,357 |
2 | 4,357 |
62 | 4,357 |
31 | 4,357 |
0 | 4,357 |
0 | 4,357 |
192 | 4,357 |
669 | 4,357 |
0 | 4,357 |
1,536 | 4,357 |
488 | 4,357 |
1,465 | 4,357 |
488 | 4,357 |
1,465 | 4,357 |
488 | 4,357 |
1,465 | 4,357 |
488 | 4,357 |
1,465 | 4,357 |
488 | 4,357 |
1,465 | 4,357 |
488 | 4,357 |
1,465 | 4,357 |
516 | 4,357 |
1,483 | 4,357 |
488 | 4,357 |
1,465 | 4,357 |
32 | 4,357 |
462 | 4,357 |
459 | 4,357 |
456 | 4,357 |
793 | 4,357 |
1,482 | 4,357 |
798 | 4,357 |
1,492 | 4,357 |
461 | 4,357 |
460 | 4,357 |
406 | 4,357 |
785 | 4,357 |
952 | 4,357 |
463 | 4,357 |
462 | 4,357 |
12 | 4,357 |
1,191 | 4,357 |
1,191 | 4,357 |
1,191 | 4,357 |
1,191 | 4,357 |
1,191 | 4,357 |
522 | 4,357 |
522 | 4,357 |
522 | 4,357 |
522 | 4,357 |
522 | 4,357 |
2,805 | 4,357 |
2,805 | 4,357 |
2,838 | 4,357 |
2,807 | 4,357 |
2,958 | 4,357 |
2,959 | 4,357 |
2,961 | 4,357 |
2,960 | 4,357 |
2,007 | 4,357 |
2,006 | 4,357 |
2,009 | 4,357 |
2,038 | 4,357 |
2,035 | 4,357 |
848 | 4,357 |
1,498 | 4,357 |
1,968 | 4,357 |
1,968 | 4,357 |
1,968 | 4,357 |
1,974 | 4,357 |
2,540 | 4,357 |
2,171 | 4,357 |
1,121 | 4,357 |
1,106 | 4,357 |
1,122 | 4,357 |
2,301 | 4,357 |
2,298 | 4,357 |
2,298 | 4,357 |
2,298 | 4,357 |
2,301 | 4,357 |
2,304 | 4,357 |
2,301 | 4,357 |
1,899 | 4,357 |
2,692 | 4,357 |
3,759 | 4,357 |
3,762 | 4,357 |
3,766 | 4,357 |
3,758 | 4,357 |
3,758 | 4,357 |
3,758 | 4,357 |
3,760 | 4,357 |
3,760 | 4,357 |
3,760 | 4,357 |
3,760 | 4,357 |
1,956 | 4,357 |
2,548 | 4,357 |
1,973 | 4,357 |
1,986 | 4,357 |
3,572 | 4,357 |
2,919 | 4,357 |
2,355 | 4,357 |
2,355 | 4,357 |
2,382 | 4,357 |
2,911 | 4,357 |
1,899 | 4,357 |
29 | 4,357 |
29 | 4,357 |
29 | 4,357 |
29 | 4,357 |
3,759 | 4,357 |
3,759 | 4,357 |
809 | 4,357 |
0 | 4,357 |
0 | 4,357 |
0 | 4,357 |
0 | 4,357 |
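To put the counts in the table above into proportion, here is a small sketch that turns a few of them into missing shares and flags the ones above 50 percent. The four counts are copied from the table; the list name, the function name and the 0.5 cutoff argument are only illustrative assumptions.

```python
TOTAL = 4357  # total number of observations, taken from the table

# A few missing counts copied from the table (not the full list).
missing_counts = [436, 1465, 2327, 3759]


def share(missing, total=TOTAL):
    """Missing share of one variable as a fraction of all observations."""
    return missing / total


for m in missing_counts:
    print(f"{m:>5} missing = {share(m):.1%} of {TOTAL}")

# Variables that the 50 percent rule would drop.
over_half = [m for m in missing_counts if share(m) > 0.5]
print(over_half)
```

With a cutoff of 0.5, a variable is dropped as soon as it has more than 2,178 missing values out of 4,357.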