Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • How can I create a simple table of missing observations?

    Hello,

    I've got a panel dataset with 195 countries and various explanatory variable for each state. I'm trying to create a simple table that shows the states that are missing one or more observations.

    I've created a binary "missID" variable, but when I use the code below, I get a long table with nations repeating (since it's country-year is the unit of analysis).

    In addition, because it's not an estimation, I am unable to output this using Outreg2.

    I'm sure there's an easy way to create a .CSV that lists the missing nations, but I can't seem to find it from searching online. Does anyone have some code they can share?

    Thanks in advance!

    -nick


    . list cname if missID == .

    +--------------------------------+
    | cname |
    |--------------------------------|
    9. | Bahamas, The |
    10. | Bahamas, The |
    11. | Bahamas, The |
    12. | Bahamas, The |
    13. | Cuba |
    |--------------------------------|
    14. | Cuba |
    15. | Cuba |
    16. | Cuba |
    17. | Haiti |
    20. | Haiti |

    etc...

  • #2
    Try this:

    Code:
    keep if missID
    keep cname
    duplicates drop
    export delimited using countries_with_missing_observations.csv // CHOOSE APPROPRIATE OPTIONS

    Comment


    • #3
      Hmm... That seems to generate (in Stata 12) a table of all the observations.

      Comment


      • #4
        Some lack of clarity here. Do you mean

        1. countries that have any observations with any missing values on any variables

        2. countries that have gaps, so that observations are omitted ("missing observations")

        3. countries that should be in the dataset but aren't (""missing nations")

        or something else.

        I am guessing that you meant #1 but before spending time on thinking up code (noting also that you don't provide a workable data example) it would be helpful to get confirmation.

        You've asked a lot of questions here, so should know the drill, but FAQ Advice #12 still applies.

        Comment


        • #5
          Apologies for being unclear: Yes, it's #1, but with the distinction that I'm first trying to determine which countries have missing data in any variable for all years.

          I can't post the full dataset due to restrictions, but it contains 195 countries with each country having 4 years worth of data.

          The code I'm using (per an example on the old Stata list) is:
          gen missID = RPEgdp5 + polity2_5 + left5 + pm25
          list cname if missID == .
          export excel using FILE NAME

          Comment


          • #6
            Here's some sample data (there is no need to provide your private dataset, just something that helps us to understand and work with in Stata) and code that will get you close to what you want, I think:

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input float(country year x1 x2 x3 x4)
            1 2000 1 0 1 0
            1 2001 0 1 1 0
            1 2002 . 0 1 0
            1 2003 1 . 1 .
            2 2000 0 1 2 0
            2 2001 1 0 2 0
            2 2002 1 1 2 0
            2 2003 0 . 0 .
            3 2000 1 0 0 0
            3 2001 1 0 0 0
            3 2002 1 1 0 0
            3 2003 0 0 1 1
            end
            
            list, sepby(country)
            
            egen has_miss=rowmiss(x1 x2 x3 x4)
            bysort country: egen n_miss=total(has_miss)
            egen tag=tag(country)
            
            list country if tag==1 & n_miss>0
            *or
            tab country if tag==1 & n_miss>0
            Stata/MP 14.1 (64-bit x86-64)
            Revision 19 May 2016
            Win 8.1

            Comment


            • #7
              See http://www.statalist.org/forums/foru...aging-missings for one tool.

              But let's approach this directly.

              Suppose you have variables country year a b c

              Then you first count within observations how many missings you have across those variables.

              If within countries, the smallest number of such missings is positive, then that number is always positive.

              So, you just want a list of any such countries.

              I downloaded these data from the Hogwarts website. I can't give the URL here due to restrictions applying to Muggles.


              Code:
              clear
              input str9 country year a b c
              "Lorien"  1776 1  2 3
              "Lorien"  1812 4  5 6
              "Ruritania" 1859 7 8 9
              "Ruritania" 1952 10 11 .
              "Mordor" 2014 . 13 14
              "Mordor" 2015 15 16 .
              end
              
              egen nmissing = rowmiss(a b c)  
              egen all_bad = min(nmissing), by(country)
              egen tag = tag(country)
              list, sepby(country)
              
                   +------------------------------------------------------------+
                   |   country   year    a    b    c   nmissing   all_bad   tag |
                   |------------------------------------------------------------|
                1. |    Lorien   1776    1    2    3          0         0     1 |
                2. |    Lorien   1812    4    5    6          0         0     0 |
                   |------------------------------------------------------------|
                3. | Ruritania   1859    7    8    9          0         0     1 |
                4. | Ruritania   1952   10   11    .          1         0     0 |
                   |------------------------------------------------------------|
                5. |    Mordor   2014    .   13   14          1         1     1 |
                6. |    Mordor   2015   15   16    .          1         1     0 |
                   +------------------------------------------------------------+
              
              list country if all_bad & tag
              
                   +---------+
                   | country |
                   |---------|
                5. |  Mordor |
                   +---------+
              Does that help?

              Note: Carole evidently has similar ideas.

              Comment


              • #8
                Thank you Drs. Wilson and Cox! And thanks also for the nmissing package, which I have now installed.

                Comment

                Working...
                X