Dear Statalists,

I have a discrete variable that takes integer values greater than or equal to zero. I want to cut the values into categories and am currently trying to write a procedure that finds the smallest "bin" width such that, for each value of the binned variable, there are at least three observations in the data. As I don't have access to the data, the procedure has to be automated: it should check whether the current "bin" width satisfies the criterion and, if not, increase the width by 1.

Here is a bit of non-working code that might clarify what I intend to do. My idea was to cut the variable with egen, tabulate it, and loop through the resulting matrix to check that all counts are > 2.

A lot of parts are still missing because I am not sure how to implement the CONTINUE and the GO TO LOOP ABOVE steps, or how to make the outer loop stop once the inner loop has run through completely. In any case, the code is only meant to give an idea, as I am sure there are much more efficient ways to go about this. I am happy to hear your ideas. Thank you very much in advance!

Code:

qui su x
local end = `r(max)'
tempname A temp
local r = 1
foreach i = 1(1)100{
    cap drop `temp'
    egen `temp' = cut(x, at(0(`i')`end'))
    qui tab `temp', matcell(`A')
    forvalues i = 1(1)`= rowsof(`A')'{
        if `A'[`i',1] > 2 {
            CONTINUE
        }
        else {
            GO TO LOOP ABOVE
        }

Best,

Felix

## Comment
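
A minimal sketch of one way the search described in the question could be written. It is not the code from the post. It assumes that x is a non-negative integer variable and that only occupied bins need at least three observations, which is what tabulating the cut variable would check. It swaps egen's cut() for floor(x / width), which gives bins [0, width), [width, 2*width), and so on without needing a list of breakpoints. The names width, found, bin and n are illustrative.

```stata
* Sketch only: smallest bin width such that every occupied bin
* holds at least 3 observations. Assumes x >= 0 and integer-valued.
quietly summarize x
local max = r(max)
local found = 0
local width = 0
tempvar bin n

* Try widths 1, 2, ..., max + 1; at max + 1 all values share one bin.
while !`found' & `width' <= `max' {
    local ++width
    capture drop `bin' `n'
    * Bin index: 0 for [0, width), 1 for [width, 2*width), ...
    quietly generate long `bin' = floor(x / `width') if !missing(x)
    * Number of observations in each occupied bin
    quietly bysort `bin': generate long `n' = _N if !missing(`bin')
    quietly summarize `n', meanonly
    * Stop as soon as the smallest occupied bin has at least 3 observations
    if r(N) > 0 & r(min) >= 3 {
        local found = 1
    }
}

if `found' {
    display as text "smallest admissible bin width: " as result `width'
}
else {
    display as error "no bin width gives at least 3 observations per bin"
}
```

A while loop with a flag avoids the need for CONTINUE or GO TO: the minimum bin count is checked once per width, so no inner loop over the tabulation matrix is required. Note that bysort changes the sort order of the data, and that empty bins between occupied ones are not counted as violations under the assumption stated above.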