  • Goodness-of-fit for binomial (not inverse binomial) variable

    I have a dataset containing 5,772,663 distinct values. If I plot them using a histogram, a (more or less) nice normal curve appears; see my screenshot below (produced with the histogram command using 200 bins and the normal option).
    [Screenshot: Bildschirmfoto 2021-11-27 um 15.06.44.png]

    Now, as everyone probably knows, testing whether something is normal is a headache, and my example is no exception. So my idea was to test whether we see a binomial distribution here. To my surprise, I haven't found anything about goodness-of-fit for a binomial distribution. What I would do now is get the data from the above graph (using serset; there are some very helpful posts here) and then work through the usual chi-squared test. This seems very cumbersome. Is there a more straightforward procedure for this? Thanks in advance.
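For reference, the "usual chi-squared test" mentioned above can be sketched in a few lines. This is only an illustration in Python, not a Stata procedure, and it rests on assumptions that are not in the post: the variable takes integer values 0..n with a known number of trials `n`, and the function name `binomial_gof` and the simulated data are hypothetical.

```python
import math
import random

def binomial_gof(counts, n):
    """Pearson chi-squared statistic for a binomial(n, p) fit.

    counts[k] is the number of observations equal to k, for k = 0..n.
    The number of trials n is assumed known; p is estimated from the data.
    """
    total = sum(counts)
    # Estimate p from the sample mean (maximum likelihood when n is fixed).
    p_hat = sum(k * c for k, c in enumerate(counts)) / (n * total)
    chi2 = 0.0
    for k, observed in enumerate(counts):
        expected = total * math.comb(n, k) * p_hat**k * (1 - p_hat) ** (n - k)
        chi2 += (observed - expected) ** 2 / expected
    # Cells minus 1 for the fixed total, minus 1 for the estimated p.
    df = len(counts) - 2
    return chi2, df, p_hat

# Simulated stand-in data (an assumption, not the poster's dataset).
random.seed(1)
n = 8
draws = [sum(random.random() < 0.5 for _ in range(n)) for _ in range(20000)]
counts = [draws.count(k) for k in range(n + 1)]
chi2, df, p_hat = binomial_gof(counts, n)
print(f"chi2 = {chi2:.2f}, df = {df}, p_hat = {p_hat:.4f}")
```

The statistic would then be compared with a chi-squared distribution on `df` degrees of freedom. Cells with very small expected counts are usually pooled first, which this sketch does not do.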

  • #2
    Why do you want to do this? With N well over 5,000,000, it is highly likely that any goodness-of-fit test against any parametric distribution will reject, even though the data are obviously very close to a normal distribution: close enough for any normal-theory based analysis to work just fine. What analysis for this variable do you have in mind? If it is so sensitive to departures from some specific distribution that this kind of data will not be adequately handled by it, then the analysis is, for practical purposes, useless anyway.
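The point about sample size can be made concrete with a small sketch. For a fixed set of observed cell proportions, the Pearson chi-squared statistic equals N times a constant, so a departure that is invisible at N = 5,000 is overwhelmingly "significant" at N = 5,772,663. The three-cell proportions below are made up for illustration, and the function name `chi2_from_proportions` is hypothetical.

```python
def chi2_from_proportions(observed_props, hypothesized_props, n_obs):
    """Pearson chi-squared statistic when cell counts are n_obs * proportion."""
    return n_obs * sum(
        (o - h) ** 2 / h for o, h in zip(observed_props, hypothesized_props)
    )

# Hypothesized distribution and a practically negligible departure from it
# (both invented for this illustration).
hypothesized = [0.25, 0.50, 0.25]
observed = [0.252, 0.498, 0.250]

# 5% critical value of the chi-squared distribution with 2 degrees of freedom.
CRITICAL_5PCT_DF2 = 5.991

small = chi2_from_proportions(observed, hypothesized, 5_000)
large = chi2_from_proportions(observed, hypothesized, 5_772_663)
for label, stat in (("N = 5,000", small), ("N = 5,772,663", large)):
    verdict = "reject" if stat > CRITICAL_5PCT_DF2 else "do not reject"
    print(f"{label}: chi2 = {stat:.2f} -> {verdict}")
```

The same 0.2 percentage point discrepancy gives a statistic of about 0.12 at the smaller N and about 138.5 at the larger one, which is why a formal test at this sample size says little about whether the fit is good enough in practice.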
