
  • Bootstrapping for logistic regression on imbalanced data

    Hi there, I've googled quite a bit but haven't figured out how to implement this in Stata.

    I want to use bootstrapping in logistic regression modeling. The data set is imbalanced (75% = 1, 25% = 0). I want to run the regression 1000 times (rep=1000) on samples that contain equal numbers of 1s and 0s, with 200 observations on each side. Next, I want to store the parameter estimates and compute 95% confidence intervals from their sampling distributions (histograms of the stored estimates). I also want to store the chi-square value to see the distribution of the overall model significance.

    Any advice and help is appreciated!

    Isabel

  • #2
    Why are you limiting the sample to 200?



    • #3
      I share George's question about the method here, and I also wonder about the goal. Perhaps some sort of counterfactual is wanted, i.e., "what would the sampling distribution have been had I started from a sample drawn from a population balanced with respect to my binary outcome?"

      I'm not sure "googling" was the best way to get ideas here, as Stata's -help bootstrap-, with particular attention to the -reps()-, -strata()-, and -size()- options, speaks directly to some of these issues. Perhaps that had already been examined, but if the original question had pointed to what is discussed there, it would have been easier to give an answer that clarified things. That said, I would suggest that something along the following lines should work:
      Code:
      bootstrap, reps(1000) size(200) strata(y01) : logistic y01 x1 x2 x3 .....
      where "y01" is the existing response variable.
      If a visual examination of histograms is wanted, or if some other detailed examination of the bootstrap samples is desired, then the -saving()- option of -bootstrap- would also be useful.
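
      As a rough sketch of how the saved replicates might be examined: the predictors x1, x2, x3, the file name bs_results, the seed, and the result names chi2 and b_x1 to b_x3 are all placeholders of mine, not anything from the original data. As I read -help bootstrap-, -size()- combined with -strata()- sets the number of observations drawn within each stratum, so each replication here would use 200 observations per outcome level. Note that _b[] after -logistic- holds coefficients on the log-odds scale, not odds ratios, and that e(chi2) is the overall model chi-squared.

      ```stata
      * Sketch only: y01, x1-x3, the file name bs_results, and the seed are placeholders.
      * Keep the model chi-squared and each coefficient from every replication.
      bootstrap chi2=e(chi2) b_x1=_b[x1] b_x2=_b[x2] b_x3=_b[x3], ///
          reps(1000) size(200) strata(y01) seed(12345)           ///
          saving(bs_results, replace): logistic y01 x1 x2 x3

      * Percentile-based 95% confidence intervals from the bootstrap distribution
      estat bootstrap, percentile

      * Load the saved replicates (this replaces the data currently in memory)
      use bs_results, clear
      histogram chi2
      histogram b_x1
      centile b_x1 b_x2 b_x3, centile(2.5 97.5)
      ```

      The /// line continuations assume this is run from a do-file. Save the original data first, since -use ..., clear- discards whatever is in memory.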



      • #4
        What Mike said.
