  • #16
    I have unbalanced panel data with group sizes from 1 to 9. How can I remove observations where the group size is less than a specified number, say 3?

    • #17
      Kiivenger:
      you should not do this, as you would end up with a dataset that would not mirror the original one.
      Kind regards,
      Carlo
      (Stata 18.0 SE)

      • #18
        Carlo,

        But a group size of 1 may make no sense in a panel data analysis, would it?

        Your response is sincerely appreciated.

        Thanks.

        Raj

        • #19
          Raj:
          you can have a panel with one observation only, as data are what they are and we have to live with them.
          That panel will contribute to the regression with one observation only.
          It would be more interesting to investigate why a panel has one observation only.
          Kind regards,
          Carlo
          (Stata 18.0 SE)

          • #20
            Thanks, Carlo. I have a fairly large dataset of 14800 observations (N) with 9 time periods. Some observations get deleted because other control variables are not available.

            Is there a way to delete groups with few observations? I just want to see if the results are any different. I tried to use the following code:

            xtdes
            local maxobs=r(max)
            sort tic1 year
            by tic1: g obsn=_N
            drop if obsn<3

            where tic1 is my 'panelvar'.



            Even after I removed groups with fewer than 3 observations, I am getting these results:

            Between regression (regression on group means)  Number of obs     =     14,782
            Group variable: tic1                            Number of groups  =      2,926

            R-sq:                                           Obs per group:
                 within  = 0.0240                                         min =          1
                 between = 0.2791                                         avg =        5.1
                 overall = 0.2500                                         max =          9

                                                            F(10,2915)        =     112.87
            sd(u_i + avg(e_i.))=  1.01033                   Prob > F          =     0.0000



            How can this be? The minimum 'Obs per group' should be 3, not 1.

            I am confused. Any ideas?

            Thanks and warm regards.

            Raj

            • #21
              Raj:
              for the future please use CODE delimiters to share what you typed and what Stata gave you back. Thanks.
              The issue may be that you have missing values in other variables; by default, Stata omits observations with missing values in any variable used in the model.
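              One way to check this could be to count, for each panel, how many observations actually enter the estimation sample. What follows is only a sketch: y, x1 and x2 are hypothetical placeholders for your own model variables, used and n_used are made-up variable names, and it assumes the data have already been -xtset- with tic1 as the panel variable.

              ```stata
              * y, x1, x2 are hypothetical placeholders for the actual model variables
              * assumes -xtset tic1 year- has already been run
              xtreg y x1 x2, be

              * flag the observations that Stata actually used in the regression
              generate byte used = e(sample)

              * number of used observations in each panel
              bysort tic1: egen n_used = total(used)

              * distribution of per-panel counts among the used observations
              tabulate n_used if used

              * re-estimate on panels contributing at least 3 used observations
              xtreg y x1 x2 if n_used >= 3, be
              ```

              If missing values are the cause, -tabulate- should show panels with n_used below 3, even though every panel has at least 3 rows in the dataset.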
              Kind regards,
              Carlo
              (Stata 18.0 SE)

              • #22
                Carlo:

                But the minimum 'Obs per group' should be 3, not 1, right?

                Raj
                (Stata15)

                • #23
                  Raj:
                  it depends.
                  From the following toy-example, you can see that, even though each panel has 3 observations, only 4 out of 6 are included in -summarize-, due to missing values in -A-:
                  Code:
                  . set obs 6
                  number of observations (_N) was 0, now 6
                  
                  . g id=1 in 1/3
                  (3 missing values generated)
                  
                  . replace id=2 if id==.
                  (3 real changes made)
                  
                  . g A=runiform() in 2/5
                  (2 missing values generated)
                  
                  . bysort id: keep if _N==3
                  (0 observations deleted)
                  
                  . sum A
                  
                      Variable |        Obs        Mean    Std. Dev.       Min        Max
                  -------------+---------------------------------------------------------
                             A |          4    .1952401    .1413651   .0285569   .3488717
                  
                  . list
                  
                       +---------------+
                       | id          A |
                       |---------------|
                    1. |  1          . |
                    2. |  1   .3488717 |
                    3. |  1   .2668857 |
                    4. |  2   .1366463 |
                    5. |  2   .0285569 |
                       |---------------|
                    6. |  2          . |
                       +---------------+
                  
                  .
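                  Continuing the toy-example, a possible way to keep only the panels with enough usable observations is to count nonmissing values instead of rows. This is a sketch only; n_ok and nmiss are made-up variable names, and x1 and x2 in the last comment are hypothetical.

                  ```stata
                  * number of nonmissing values of A within each id
                  bysort id: egen n_ok = count(A)

                  * keep only ids with at least 3 usable observations
                  keep if n_ok >= 3

                  * with several model variables, count complete rows instead, e.g.:
                  *   egen nmiss = rowmiss(A x1 x2)
                  *   bysort id: egen n_ok = total(nmiss == 0)
                  ```

                  In the toy data each id has only 2 nonmissing values of -A-, so a threshold of 3 would drop both panels.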
                  Kind regards,
                  Carlo
                  (Stata 18.0 SE)

                  • #24
                    Thanks, Carlo. That is a good example. So, in a case like that, how can I exclude ids that have fewer than 3 usable observations? I just want to do a sensitivity analysis to see whether the original results hold up. Is there a way to do this?

                    Best regards,

                    Raj Iyengar
                    (Stata 15.1 SE)

                    • #25
                      Carlo,

                      I also posted a question on another forum for selecting size-matched control firms:
                      https://www.statalist.org/forums/for...d-gender/page4

                      Your assistance is sincerely appreciated.

                      Best regards,

                      Raj Iyengar

                      • #26
                        Raj:
                        you actually posted another topic on the same (General) forum.
                        Kind regards,
                        Carlo
                        (Stata 18.0 SE)
