  • missing data in panel data

    Hi,

    I am working on panel data for the period 2012 to 2020, with 2,529 observations. I need help dealing with missing data in my variables (dependent, independent, control and dummy). My panel is unbalanced, and the missing values occur at the beginning, in the middle and at the end of a firm's series. How can I solve this issue, please?

    xtset id year
    panel variable: id (unbalanced)
    time variable: year, 2012 to 2020, but with gaps
    delta: 1 unit

    There are missing values in some variables, as shown below:
    id   year   W_ROA   CEO_DUALITY   YEAR OF INC
    17   2012    9.84   .             2007
    17   2013    7.66   .             2007
    17   2014   13.3    .             2007
    17   2015    .      .             2007
    17   2016    .      .             2007
    17   2017    9.78   .             2007
    17   2018   15.25   .             2007
    17   2019   11.78   .             2007
    17   2020   13.87   .             2007
    18   2012    .      .             2015
    18   2013    .      .             2015
    18   2014    .      .             2015
    18   2015    .      .             2015
    18   2016    .      0             2015
    18   2017    2.81   0             2015
    18   2018    3.25   0             2015
    18   2019    3.65   1             2015
    18   2020    3.97   1             2015

    How can I fill in the missing values, and should I take the year of incorporation into account?




  • #2
    Hi, you can use Stata's ipolate command to fill in the missing values by linear interpolation.
    Please see the manual entry:
    https://www.stata.com/manuals/dipolate.pdf
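
As a minimal sketch of that suggestion, the lines below run `ipolate` within each firm on the W_ROA values posted in #1. The new variable name `W_ROA_ip` is hypothetical. Because the `epolate` option is not specified, only gaps that lie between two observed values are filled; firm 18's years before its first recorded W_ROA stay missing.

```stata
* Sketch only: linear interpolation of W_ROA within each firm.
clear
input id year W_ROA
17 2012 9.84
17 2013 7.66
17 2014 13.3
17 2015 .
17 2016 .
17 2017 9.78
17 2018 15.25
17 2019 11.78
17 2020 13.87
18 2012 .
18 2013 .
18 2014 .
18 2015 .
18 2016 .
18 2017 2.81
18 2018 3.25
18 2019 3.65
18 2020 3.97
end

* W_ROA_ip is a hypothetical name for the interpolated copy;
* the original W_ROA is left untouched for comparison.
bysort id: ipolate W_ROA year, generate(W_ROA_ip)

list id year W_ROA W_ROA_ip, sepby(id)
```

Keeping the interpolated values in a separate variable makes it easy to see, and later to report, which observations were filled in.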



    • #3
      Huda:
      even though panel attrition is frequent in longitudinal studies, you should investigate whether the missingness is informative before considering any interpolation or imputation strategy.
      Kind regards,
      Carlo
      (Stata 19.0)
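
One possible way to start looking at the missingness pattern, sketched on the two firms from #1. The names `miss_roa` and `n_miss_roa` are hypothetical, and this only describes where values are missing; it does not by itself establish whether the missingness is informative.

```stata
* Sketch only: describe where W_ROA and CEO_DUALITY are missing.
clear
input id year W_ROA CEO_DUALITY
17 2012 9.84 .
17 2013 7.66 .
17 2014 13.3 .
17 2015 . .
17 2016 . .
17 2017 9.78 .
17 2018 15.25 .
17 2019 11.78 .
17 2020 13.87 .
18 2012 . .
18 2013 . .
18 2014 . .
18 2015 . .
18 2016 . 0
18 2017 2.81 0
18 2018 3.25 0
18 2019 3.65 1
18 2020 3.97 1
end

xtset id year
xtdescribe                                 // participation pattern of the panel
misstable summarize W_ROA CEO_DUALITY      // counts of missing values per variable

* Hypothetical indicator: 1 if W_ROA is missing in that firm-year.
generate byte miss_roa = missing(W_ROA)
tabulate year miss_roa                     // does missingness cluster in certain years?

* Hypothetical count of missing W_ROA values per firm.
bysort id: egen n_miss_roa = total(miss_roa)
```

With the full dataset, an indicator like `miss_roa` could then be related to observed firm characteristics to see whether missing firm-years differ systematically from observed ones.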



      • #4
        Thanks Muhammad and Carlo.
        I know that interpolation would fill the gaps, but my question is how to apply it only to gaps in the middle or at the end of a firm's series. Some firms were established after 2012, so I don't want to fill in missing values for years before they were established.

        Can you suggest a suitable way to solve this issue?



        • #5
          Huda:
          the safest approach is to live with your unbalanced panel dataset.
          Obviously, this approach comes at the cost of ending up with a very small sample to use for your regression.
          That said, if some firms were established after a given year, there's nothing to interpolate or impute before that year.
          Kind regards,
          Carlo
          (Stata 19.0)
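
The last point can be made explicit in the data by flagging firm-years that precede incorporation, so they are never treated as gaps. This is a sketch on firm 18 from #1, assuming the "YEAR OF INC" column is stored in a variable called `year_of_inc` (a hypothetical name); `pre_inc` is likewise hypothetical.

```stata
* Sketch only: remove firm-years before the firm existed.
clear
input id year W_ROA year_of_inc
18 2012 . 2015
18 2013 . 2015
18 2014 . 2015
18 2015 . 2015
18 2016 . 2015
18 2017 2.81 2015
18 2018 3.25 2015
18 2019 3.65 2015
18 2020 3.97 2015
end

* 1 for years before incorporation, 0 otherwise.
generate byte pre_inc = year < year_of_inc

* These rows are not missing data, so they are dropped rather than filled.
drop if pre_inc

xtset id year
```

After this step the panel stays unbalanced, but any remaining missing values refer to years in which the firm actually existed.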



          • #6
            That's exactly what I mean. Can I use the following code to fill the gap after the incorporation date?
            mipolate l.roa year, by(id) gen(l.roa.i) forward
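
One thing that can be said without running it: `l.roa.i` cannot be used as a new variable name, because Stata variable names cannot contain periods. As a hedged alternative that avoids the community-contributed `mipolate` altogether, the sketch below carries the last observed value forward using official Stata only, restricted to years on or after incorporation. It uses a few rows from #1; `year_of_inc`, `W_ROA_ff` and `lag_W_ROA_ff` are hypothetical names, and the lag is created as its own variable after filling rather than passed in with a time-series operator.

```stata
* Sketch only: carry the last observed W_ROA forward within each firm.
clear
input id year W_ROA year_of_inc
17 2014 13.3 2007
17 2015 . 2007
17 2016 . 2007
17 2017 9.78 2007
18 2014 . 2015
18 2015 . 2015
18 2016 . 2015
18 2017 2.81 2015
end

* Work on a copy so the original values are preserved.
generate W_ROA_ff = W_ROA

* replace runs in observation order, so a filled value is itself carried
* forward through consecutive gaps. Leading gaps have no earlier value to
* copy and therefore stay missing.
bysort id (year): replace W_ROA_ff = W_ROA_ff[_n-1] ///
    if missing(W_ROA_ff) & year >= year_of_inc

* Create the lag afterwards, as a separate variable.
xtset id year
generate lag_W_ROA_ff = L.W_ROA_ff

list, sepby(id)
```

Whether carrying values forward is defensible for a variable like ROA is a separate question from whether the code runs.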



            • #7
              Huda:
              I cannot test your code because I'm away from my desk at the moment.
              Kind regards,
              Carlo
              (Stata 19.0)
