
  • Discerning patterns in data from missing values

    Hi all,

    I have a large dataset, but Stata drops more than half of my observations when running regressions, due to missing values. I am seeking to understand the patterns in my missing data. For example, I have variables country and year in my dataset. I want to see if the missing values are correlated with a specific country or year/time period.

    I tried a command like mvpatterns, but I am not simply trying to count the missing values per observation or the frequency with which they occur. I haven't yet been able to find a command that shows how the missing values are correlated with other variables in my dataset. Is there such a command in Stata?

    Sincerely,
    Yu Mna

  • #2
    Yu:
    welcome to this forum.
    Type -search mcartest- from within Stata and install it. Also read the help file and the Stata Journal article linked to -mcartest-.
    From your description, you are probably more interested in the mechanism underlying your missing data than in their pattern (when dealing with missing values, mechanism and pattern have different meanings).
    You may also be interested in reading the -mi- suite entries in the Stata .pdf manual.
    Kind regards,
    Carlo
    (Stata 19.0)
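
    One way to look at whether missingness is related to country or year, as asked in #1, is to flag the observations a regression would drop and compare that flag across groups. This is only a minimal sketch: y, x1 and x2 are hypothetical names standing in for the regression variables, country and year are the variables named in #1, and country is assumed to be numeric (if it is a string, -encode- it first).

    ```stata
    * Assumption: y, x1, x2 are hypothetical names for the regression variables
    * Count missing values across the model variables in each observation
    egen nmiss = rowmiss(y x1 x2)

    * Flag observations that casewise deletion would drop
    generate byte dropped = nmiss > 0

    * Share of dropped observations within each country and each year
    tabulate country dropped, row
    tabulate year dropped, row

    * Does country or year predict being dropped?
    * Assumption: country is numeric; use -encode- first if it is a string
    logit dropped i.country i.year
    ```

    The tables show where dropped observations concentrate. The logit is a rough check of the same thing with both variables in at once, and it may omit countries or years in which every observation is dropped (or none is).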



    • #3
      See missingplot (SSC) for another take. This program was inspired by examples using other software where it works well. Surprise: often it shows that the pattern of missing values is just a mess.



      • #4
        There is also Stata's misstable.
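
        A short sketch of how misstable might be used here, assuming hypothetical regression variables y, x1 and x2, and assuming x1 has at least one missing value (the generate() option creates an indicator only for variables that do):

        ```stata
        * Assumption: y, x1, x2 are hypothetical names for the regression variables
        * Number of missing values in each variable
        misstable summarize y x1 x2

        * Which combinations of variables are missing together, as frequencies
        misstable patterns y x1 x2, frequency

        * Create indicators miss_<varname> for variables with missing values
        misstable summarize y x1 x2, generate(miss_)

        * Cross-tabulate one indicator against country and year
        tabulate country miss_x1, row
        tabulate year miss_x1, row
        ```

        The indicators can then be tabulated or modelled against any other variable in the dataset.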



        • #5
          Thank you all. I ended up using misstable.
