
  • Count missing observations in Panel data

    Hi there,

    How can I count the missing observations in panel data by ID and by Year?

    Thanks for your time!

  • #2
    For variable x


    Code:
    egen wanted = total(missing(x)) , by(ID Year)
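
    To see what this returns, here is a minimal sketch on made-up data; the variable names ID, Year and x and all values are assumptions for illustration only. With one observation per ID and Year, the count within each cell can only be 0 or 1, so a second variable with by(ID) alone is added to give a count per ID across years.

    ```stata
    clear
    input ID Year x
    1 2001 5
    1 2002 .
    1 2003 .
    2 2001 7
    2 2002 8
    2 2003 .
    end

    * missing values of x within each ID-Year cell (0 or 1 in a clean panel)
    egen wanted = total(missing(x)), by(ID Year)

    * missing values of x for each ID across all years
    egen wanted_id = total(missing(x)), by(ID)

    list, sepby(ID)
    ```

    The new variables repeat the same value on every observation in a group, so they are best inspected with list or tabulate rather than summarize.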



    • #3
      Nick

      This code does not work.

      For instance, I run
      Code:
      egen wanted = total(missing( gdp)) , by(id year)
      
       sum wanted
      
          Variable |       Obs        Mean    Std. Dev.       Min        Max
      -------------+--------------------------------------------------------
            wanted |      8134    .3840669    .4864037          0          1

      Moreover, I need the missing values across all variables, tabulated by year and by id.

      Example

      Year    Total missing observations (all variables)
      2001    55
      2002    65
      2003    0
      2004    74

      Country ID    Total missing observations (all variables)
      1             0
      2             7
      3             9
      4             12

      I'm looking for a similar outcome through code. Thanks for your help.



      • #4
        Not so; it works. I didn't suggest that you summarize the results without qualification.

        Or rather, your complaint appears to be that the code doesn't address questions you didn't ask in #1.

        I ignored the wording "missing observations", which in itself has no meaning. Your observations must have non-missing values on identifier and year for you to ask these questions at all. Otherwise, what precisely do you want?

        1. Observations with any other variable missing.

        2. Observations with all other variables missing.

        I could make guesses, but best for you to clarify what you want.
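
        The two readings can be told apart with a row-wise count of missing values. This is only a sketch on the auto dataset, where make stands in for the identifier; the names others, k and nmiss are made up for illustration.

        ```stata
        sysuse auto, clear

        * all variables other than the identifier
        ds make, not
        local others `r(varlist)'
        local k : word count `others'

        * number of missing values in each observation
        egen nmiss = rowmiss(`others')

        count if nmiss > 0      // 1. any other variable missing
        count if nmiss == `k'   // 2. all other variables missing
        ```

        In a panel, the same idea would apply with the identifier and year variables excluded in the ds command.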






        • #5
          Yes, I was wrong to summarize. I could not work out how to make a table like the example I showed in #3.

          I have a panel dataset of 30 countries over 20 years with 12 variables. I want to count the total missing observations (observations missing on any variable) in one table by year and in another table by country.



          • #6
            Sounds like

            Code:
            ds id year, not
            
            egen missing = rowmiss(`r(varlist)') 
            
            tab id if missing 
            
            tab year if missing
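
            The two tab commands count observations with at least one missing value. If the total number of missing values is wanted instead, as the example tables in #3 may suggest, a sketch along these lines could help; the toy data and the name nmiss are assumptions for illustration.

            ```stata
            * toy panel: id, year and two other variables (all values made up)
            clear
            input id year gdp reer
            1 2001 . 1.2
            1 2002 3 .
            2 2001 . .
            2 2002 4 0.9
            end

            ds id year, not
            egen nmiss = rowmiss(`r(varlist)')

            * observations with at least one missing value, by year
            tab year if nmiss

            * total number of missing values across all variables
            tabstat nmiss, by(year) statistics(sum)
            tabstat nmiss, by(id) statistics(sum)
            ```

            Here tab counts rows, while tabstat with statistics(sum) adds up the per-row counts, so the two can differ whenever a row has more than one missing value.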



            • #7
              I could not understand this code. It produces the outcome below. Correct me if I'm wrong.

              Code:
              ds id year, not
              
              extd_total ers bmgdp extd reer tradegdp mr mi bmr tservice sfranc bc           
              
              . egen missing = rowmiss(`r(extd_total ers bmgdp extd reer tradegdp mr mi bmr tservice sfranc bc )')
              
              . tab id if missing
              no observations
              
              . tab year if missing
              no observations



              • #8
                You didn't type what I suggested. I meant what I typed. (If you don't understand code, fair enough, but guessing that it might be something else you don't understand either is not a good tactic.)

                Here is a reproducible example. In the auto dataset, there are 5 observations with missing values, and they occur on only one variable, rep78.

                Code:
                . sysuse auto, clear
                (1978 Automobile Data)
                
                . ds make, not
                price         rep78         trunk         length        displacement  foreign
                mpg           headroom      weight        turn          gear_ratio
                
                . egen missing = rowmiss(`r(varlist)')
                
                . tab foreign if missing
                
                   Car type |      Freq.     Percent        Cum.
                ------------+-----------------------------------
                   Domestic |          4       80.00       80.00
                    Foreign |          1       20.00      100.00
                ------------+-----------------------------------
                      Total |          5      100.00
                
                . tab rep78 if missing
                no observations
                
                . tab rep78 if missing, missing
                
                     Repair |
                Record 1978 |      Freq.     Percent        Cum.
                ------------+-----------------------------------
                          . |          5      100.00      100.00
                ------------+-----------------------------------
                      Total |          5      100.00
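
                For anyone unsure what r(varlist) refers to: ds leaves the names it listed behind as a stored result, and the macro reference picks that list up. The lines below are only a sketch to make this visible, again on the auto dataset.

                ```stata
                sysuse auto, clear
                ds make, not

                * ds stores the names it listed as r(varlist)
                return list

                * the macro reference is replaced by that stored list
                * before the command runs
                display "`r(varlist)'"
                egen missing = rowmiss(`r(varlist)')
                ```

                The text r(varlist) should be typed literally; replacing it with the variable names inside the quotes, as in #7, leaves an empty macro and so nothing to count.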



                • #9
                  I had misunderstood the word "varlist". Thank you.



                  • #10
                    Pleased you solved the problem. Thanks for the closure.



                    • #11
                      I am running xtreg with many variables that have a lot of missing data. As a result, for some groups (IDs) only one observation is used in the panel regression because of missing values in the independent variables. I want to exclude from the regression the IDs that have fewer than 10 observations. How can I do that? Please help!



                      • #12
                        Say the regression that you are running is
                        xtreg y x1 x2 x3

                        Try

                        Code:
                        egen nonmissings = count(y + x1 + x2 + x3), by(id)
                        
                        xtreg y x1 x2 x3 if nonmissings > 9
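
                        The idea is that a sum such as y + x1 is missing whenever any of its terms is missing, so count() returns the number of complete rows for each id. Here is a minimal sketch on made-up data, assuming all variables are numeric; regress is used only as a stand-in for xtreg to keep the toy example small.

                        ```stata
                        * toy panel: id 1 has 12 complete rows, id 2 has only 3
                        clear
                        set obs 24
                        generate id = cond(_n <= 12, 1, 2)
                        generate year = 2000 + mod(_n - 1, 12)
                        generate x1 = _n
                        generate y = 2 * x1 + runiform()
                        replace x1 = . if id == 2 & year > 2002

                        * number of rows per id with y and x1 both non-missing
                        egen nonmissings = count(y + x1), by(id)

                        tabulate id nonmissings
                        regress y x1 if nonmissings > 9
                        ```

                        The condition must refer to the variable created by egen (nonmissings here), whatever name is chosen for it.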
                        (In reply to #11, originally posted by Ujjawal Sawarn.)



                        • #13
                          Thank you for your response. I keep getting a "not sorted" error, although I have tried several changes.
