Same commands different numbers

Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#16

01 Jan 2019, 12:13

Run:

Code:

duplicates tag tinh huyen xa diaban hoso, gen(flag)
browse if flag

This will show you the observations that are causing the problem. It is then up to you to figure out what to do to fix it. Broadly speaking there are a few possibilities (though the details of how you fix them are very numerous):

1. These observations are all correct and belong in the data. You have simply misunderstood the structure of your data when you thought that tinh huyen xa diaban hoso would uniquely identify observations. Then the question arises whether there is some other variable (or perhaps more than one) that, combined with tinh huyen xa diaban hoso, uniquely identifies the observations. If so, you can modify your code accordingly, adding that variable (or those variables) to the -sort- or -isid, sort- commands. Your calculations will then become deterministic.

If not, then you need to completely rethink your analysis because it is dependent on the arbitrary ordering of the observations in the data. You need a different algorithm.

2. Some of these observations contain incorrect data, or they contain correct data but do not really belong in this data set. Then you have to go back to how this data set was created and fix the problems that led to the inclusion of these observations or the incorrect data. A subtle version of this is that the observations themselves are correct as far as they go, but they should have been combined in some way (perhaps taking averages of the variables other than tinh huyen xa diaban hoso, or something like that) into a single observation.

Either way, in the end, you need to get a better understanding of your data or of the algorithm you are trying to apply to it. It is impossible to give more specific advice from a distance.
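
As a rough sketch of what the two possibilities above might look like in code: the names matv and income below are hypothetical placeholders, not variables from this thread, and the two blocks are alternatives, not steps to run in sequence. Which one applies, if either, depends entirely on what the flagged observations turn out to be.

```stata
* Summarize how many surplus copies exist for each combination of the key variables
duplicates report tinh huyen xa diaban hoso

* Possibility 1: an additional variable completes the identifier.
* matv is a hypothetical variable name; substitute whatever variable
* actually distinguishes the flagged observations in your data.
* -isid- stops with an error if the combination is still not unique.
isid tinh huyen xa diaban hoso matv, sort

* Possibility 2 (alternative to the above): the duplicates should be
* combined into a single observation.
* income is a hypothetical numeric variable; averaging is only one of
* many ways the observations might be combined.
collapse (mean) income, by(tinh huyen xa diaban hoso)
isid tinh huyen xa diaban hoso
```

Note that -collapse- replaces the data in memory and keeps only the variables named in the command, so it is safest to try it on a copy of the data set first.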
Linh mt

Join Date: May 2017

Posts: 33
#17

02 Jan 2019, 07:47

Thank you very much for your advice, Clyde. I will look at the data again and try to find the underlying cause of the problem.
PS. I am really surprised that you know the variables in the data set I am working on. I do not remember giving them to you :))
Clyde Schechter

Join Date: Apr 2014

Posts: 30084
#18

02 Jan 2019, 09:58

You did mention those variables in #15. Take a look back at it.

but the error is variables tinh huyen xa diaban hoso do not uniquely identify the observations