
  • Missing values in regression.

    Dear all, I'm fitting multilevel regressions on a dataset that contains missing values. As a result, my regression models have different sample sizes.
    These are the regressions:
    [Attached image: regs.PNG]

    Looking at the sample sizes, we can see that there are 16381 observations with no missing values in those variables.
    My question is whether it is better to drop the observations with missing values, so that all of these regressions are fitted on the same subset of 16381 observations.
    I did not drop the rows with missing values because I am not running between-model tests that require the same sample, and I did not want to lose information by dropping observations.
    What would you suggest?
    Thank you all in advance
    Last edited by Luca Tognoni; 03 Apr 2022, 07:21.

  • #2
    The usual recipe includes:
    1) do not drop observations with missing values, so as to avoid making up your dataset and ending up with a subsample that has a tenuous relationship with the original source;
    2) try to collect the missing data via queries sent out to researchers (in case of an empirical study);
    3) if 2) goes awry, diagnose the mechanism underlying the missingness (MCAR; MAR; MNAR) and, sometimes, the pattern (monotone; general; univariate missingness);
    4) deal with missing data accordingly.
    Kind regards,
    (StataNow 18.5)


    • #3
      In essence, a model fit ignores observations with missing values on any variable in that model, so dropping those observations beforehand leads to the same results.
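
      To make the point about casewise deletion concrete, here is a minimal sketch in plain Python rather than Stata. The toy data, the variable names (y, x1, x2) and the helper functions are all hypothetical and are not taken from the original dataset. The sketch assumes that the fitting routine silently skips rows with a missing value in any variable it uses, which is how estimation commands typically behave. It shows that dropping those rows beforehand changes nothing for that model, whereas restricting to rows that are complete on every variable across all models does shrink the sample for the smaller model.

```python
from statistics import mean

# Hypothetical toy data; None marks a missing value.
rows = [
    {"y": 1.0, "x1": 0.0, "x2": 1.0},
    {"y": 3.1, "x1": 1.0, "x2": None},
    {"y": 4.9, "x1": 2.0, "x2": 0.5},
    {"y": 7.2, "x1": 3.0, "x2": None},
    {"y": None, "x1": 4.0, "x2": 2.0},
    {"y": 11.0, "x1": 5.0, "x2": 1.5},
]


def complete_cases(data, variables):
    """Keep only rows with no missing value in any of the listed variables."""
    return [r for r in data if all(r[v] is not None for v in variables)]


def fit_simple_ols(data, y="y", x="x1"):
    """Simple OLS of y on x, using only rows complete on y and x."""
    used = complete_cases(data, [y, x])
    xs = [r[x] for r in used]
    ys = [r[y] for r in used]
    xbar, ybar = mean(xs), mean(ys)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(xs, ys))
    sxx = sum((a - xbar) ** 2 for a in xs)
    slope = sxy / sxx
    return {"n": len(used), "slope": slope, "intercept": ybar - slope * xbar}


# 1) Fit on the full data: rows missing y or x1 are skipped automatically.
fit_all = fit_simple_ols(rows)

# 2) Drop rows missing y or x1 first, then fit: same sample, same estimates.
fit_dropped = fit_simple_ols(complete_cases(rows, ["y", "x1"]))

# 3) Restrict to rows complete on ALL variables (the "common sample"):
#    this discards rows that the y-on-x1 model could otherwise have used.
fit_common = fit_simple_ols(complete_cases(rows, ["y", "x1", "x2"]))

print(fit_all)
print(fit_dropped)
print(fit_common)
```

      The first two fits coincide, which is the sense in which dropping leads to the same results. The third uses fewer observations, which is the trade-off raised in the original question about fitting every model on one common subset.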


      • #4
        Thank you Carlo and Nick