  • Making panel dataset balanced - "filling down"

    Hi All,

    I posted this earlier, but conveyed my question incorrectly. The dataset I have resembles the following:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float country str7 Gender str12 Education float(AverageValue year)
    1 "Male"    "Not Educated" 2000 2000
    1 "Female"  "Educated"     3000 2000
    2 "Male"    "Educated"     3000 2000
    3 "Female " "Not Educated" 4000 2000
    1 "Male"    "Educated"     3000 2001
    1 "Female"  "Educated"     3000 2001
    2 "Female"  "Educated"     3000 2001
    3 "Male"    "Educated"     3000 2001
    3 "Male"    "Not Educated" 2000 2001
    3 "Female"  "Educated"     3000 2001
    3 "Female"  "Not Educated" 2000 2001
    end

    Here, I have data by country and year on the average wages of males and females by education level (educated or not). For expositional purposes, there are only 3 countries and 2 years. A balanced panel would contain average wages for both males and females at both education levels in every country and year. Country 3 in 2001 is an example of a complete set of observations.

    I wish to make this panel dataset balanced, i.e. fill in placeholders for the missing combinations. For example, for country 1 in 2000 I would expand "down" to add two more rows (one for educated males and one for not educated females), with missing values for the average value. I will be using an econometric model to impute these missing values, but in order to perform the imputation I need the panel dataset to be balanced.

    Any guidance on this is much appreciated.


    Many Thanks,
    CS

  • #2
    Chinmay:
    trying to convert an unbalanced panel into a balanced one is, in general, a bad idea: by ignoring the mechanisms and patterns underlying the missingness, you will in all likelihood end up with a panel that is far from the original one.
    On top of that, Stata can handle both unbalanced and balanced panel datasets without any problem.
    Kind regards,
    Carlo
    (Stata 19.0)
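
    As a rough illustration of the point that Stata handles unbalanced panels, the sketch below declares the example data from #1 as a panel without adding any observations. It assumes that dataset is in memory; the variable name id is hypothetical and introduced only for this example.

    ```stata
    * Sketch only: assumes the example dataset from #1 is in memory
    * Remove the stray trailing blank in "Female " so it is not a separate category
    replace Gender = strtrim(Gender)

    * One panel unit per country-Gender-Education combination (id is a hypothetical name)
    egen id = group(country Gender Education)

    * Declare the panel; xtset reports whether it is balanced
    xtset id year

    * Show which units are observed in which years
    xtdescribe
    ```

    On Stata versions before 14, trim() can be used in place of strtrim().
    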

    • #3
      Code:
      help fillin
      fillin adds observations with missing data so that all interactions of varlist exist, thus making a complete rectangularization of varlist.

      I’m not suggesting you should use this command (ignoring Carlo’s sound advice) but this is the command that does what you want.
      Last edited by Paul Dickman; 23 Jul 2020, 01:38.
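
      Applied to the example data in #1, the call might look like the sketch below. It assumes that dataset is in memory and that a balanced panel means every combination of country, year, Gender and Education; the trimming step is an assumption added because one value in the example is "Female " with a trailing blank, which fillin would otherwise treat as a third gender.

      ```stata
      * Sketch only: assumes the example dataset from #1 is in memory
      * Remove stray blanks such as "Female " before rectangularizing
      replace Gender = strtrim(Gender)

      * Add one observation for every missing country-year-Gender-Education combination
      fillin country year Gender Education

      * _fillin is 1 for the added observations, where AverageValue is missing
      sort country year Gender Education
      list country year Gender Education AverageValue _fillin, sepby(country year)
      ```

      With 3 countries, 2 years, 2 genders and 2 education levels, this should leave 24 observations, 13 of them flagged by _fillin.
      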

      • #4
        As pointed out by the other members, one should be very clear and theoretically justified in filling missing values. If you are, then you can use the fillmissing program, which is available on SSC, once the fillin command has created the empty observations.
        Code:
        ssc install fillmissing
        
        clear
        webuse fillin1
        fillin sex race age_group
        list
        
             +----------------------------------------------------+
             |    sex    race   age_gr~p      x1     x2   _fillin |
             |----------------------------------------------------|
          1. | female   white      20-24   20393   14.5         0 |
          2. | female   white      25-29       .      .         1 |
          3. | female   white      30-34       .      .         1 |
          4. | female   black      20-24       .      .         1 |
          5. | female   black      25-29       .      .         1 |
             |----------------------------------------------------|
          6. | female   black      30-34   39399   14.2         0 |
          7. |   male   white      20-24       .      .         1 |
          8. |   male   white      25-29   32750   12.7         0 |
          9. |   male   white      30-34       .      .         1 |
         10. |   male   black      20-24       .      .         1 |
             |----------------------------------------------------|
         11. |   male   black      25-29       .      .         1 |
         12. |   male   black      30-34       .      .         1 |
             +----------------------------------------------------+
        
        . bys sex: fillmissing x1
        (9 real changes made)
        
        . list
        
             +----------------------------------------------------+
             |    sex    race   age_gr~p      x1     x2   _fillin |
             |----------------------------------------------------|
          1. | female   white      20-24   20393   14.5         0 |
          2. | female   white      25-29   20393      .         1 |
          3. | female   white      30-34   20393      .         1 |
          4. | female   black      20-24   20393      .         1 |
          5. | female   black      25-29   20393      .         1 |
             |----------------------------------------------------|
          6. | female   black      30-34   39399   14.2         0 |
          7. |   male   white      20-24   32750      .         1 |
          8. |   male   white      25-29   32750   12.7         0 |
          9. |   male   white      30-34   32750      .         1 |
         10. |   male   black      20-24   32750      .         1 |
             |----------------------------------------------------|
         11. |   male   black      25-29   32750      .         1 |
         12. |   male   black      30-34   32750      .         1 |
             +----------------------------------------------------+
        If you want to fill the missing values with the mean, then
        Code:
        bys sex: fillmissing x1, with(mean)
        Last edited by Attaullah Shah; 23 Jul 2020, 05:22.
        Regards
        --------------------------------------------------
        Attaullah Shah, PhD.
        Professor of Finance, Institute of Management Sciences Peshawar, Pakistan
        FinTechProfessor.com
        https://asdocx.com
        Check out my asdoc program, which sends outputs to MS Word.
        For more flexibility, consider using asdocx which can send Stata outputs to MS Word, Excel, LaTeX, or HTML.

        • #5
          Thank you everyone, for the sound advice!

          Best,
          CS
