  • Dealing with missing stock price data

    Dear All,

    I have a dataset containing stock prices for many companies over 10 years, and I will conduct an event study on it. However, I have missing stock prices for a number of trading days. The missing observations are mostly randomly distributed across the companies and across time. I am wondering how to deal with this problem in Stata and what method to use. Should I, for example, carry forward the previous price, or should I use some kind of multiple imputation?

    Regards
    Anders

  • #2
    Anders:

    Both last observation carried forward (LOCF) and next observation carried backward (NOCB) are widely seen as inappropriate methods for dealing with missing data.

    -ipolate- and -mi- can be valuable approaches, and their entries in the Stata .pdf manuals are good places to start.


    I would also skim through the existing literature in your research field to track down what others did in the past when presented with the same problem (customary rules, you know, are usually well accepted by colleagues and reviewers).

    How much effort this takes will depend largely on the informativeness of your missing data, i.e. on whether the chance that a price is missing is related to the values themselves.


    You may want to increase your knowledge of the topic via textbooks such as:
    Paul D. Allison. Missing Data. http://www.sagepub.in/textbooks/Book9419.
    Stef van Buuren. Flexible Imputation of Missing Data. https://www.crcpress.com/Flexible-Imputation-of-Missing-Data/van-Buuren/9781439868249

    Finally, I would point you to the following website, http://www.missingdata.org.uk/, which is maintained by Jonathan Bartlett, whose posts appear from time to time on this forum.
    Kind regards,
    Carlo
    (Stata 19.0)
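
    A minimal sketch of the -ipolate- route mentioned above, for readers who want to see what it does before deciding whether it is appropriate. The variable names id (numeric company identifier), date (daily date) and price are assumptions, not taken from the original dataset. The missing prices are flagged first, so that interpolated values remain identifiable afterwards.

    ```stata
    * Assumed (hypothetical) variables: id, date, price
    * How much is missing overall?
    misstable summarize price

    * Flag missing prices before touching them
    generate byte price_missing = missing(price)

    * Linear interpolation within each company; the original price is left unchanged
    bysort id (date): ipolate price date, generate(price_ip)

    * How many values were filled in?
    count if price_missing & !missing(price_ip)
    ```

    Without the epolate option, -ipolate- does not extrapolate, so missing prices before a company's first or after its last observed price stay missing. Multiple imputation with -mi- needs an explicit imputation model and is not sketched here.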

    • #3
      Carlo gives good advice, and indeed understates the scope for interpolation. See e.g. mipolate (SSC).

      But for most statistical purposes, it is simplest to do nothing. If you interpolate, you don't increase the information in the data, and so e.g. degrees of freedom, and with them apparent significance, will be inflated without meaning.

      Most crucially, why do the missings arise? Does it mean that there were no transactions on those days? Seeming random in incidence is not proof of randomness in incidence.
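
      A hedged sketch of how mipolate might be used, assuming the same hypothetical variables id, date and price. mipolate is community-contributed, so the exact syntax and the available methods should be checked in its help file after installation; the by: prefix and the method names below are written from memory of that help file.

      ```stata
      * mipolate is community-contributed; install once from SSC
      ssc install mipolate

      * Assumed (hypothetical) variables: id, date, price
      * Linear interpolation and a shape-preserving alternative (pchip), by company
      bysort id (date): mipolate price date, generate(price_lin) linear
      bysort id (date): mipolate price date, generate(price_pchip) pchip

      * The number of genuinely observed prices is the same as before
      count if !missing(price)
      count if !missing(price_pchip)
      ```

      The two counts make the point about information concrete: the second is larger, but only the first reflects prices that were actually observed.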

      • #4
        Thank you Carlo Lazzaro and Nick Cox.

        In most cases I believe that there were no transactions on those days. I just found out that there are generally more missing values for stocks with low trading volume. If I drop the stocks with missing values, the study will be biased towards high-volume stocks. Do you still think it is better to drop the observations, or should I use -ipolate- or -mi- as Carlo suggests?

        • #5
          Everything depends on what you want to do. All you've told us so far is that it is an event study.

          Most Stata commands would just ignore the missing values anyway, so that's one way forward.
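
          As an illustration of what "ignore the missing values" means in practice, here is a minimal sketch. It assumes hypothetical variables id (numeric), date, price and mktret (a market return), and that days with no price are present in the data as rows with a missing price rather than absent altogether. The pooled regression is purely illustrative and is not a recommended event-study model.

          ```stata
          * Assumed (hypothetical) variables: id, date, price, mktret
          * Trading-day counter within company, so weekends and holidays are not gaps
          bysort id (date): generate long tday = _n
          xtset id tday

          * Log return; missing whenever today's or the previous day's price is missing
          generate ret = ln(price / L.price)

          * -regress- drops observations with missing ret or mktret
          regress ret mktret
          count if e(sample)
          ```

          Note that each isolated missing price removes two returns, the one ending on that day and the one starting from it.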

          • #6
            Anders Karlsson: I am currently working on a similar project. With missing observations, keep in mind that something could be fishy. In particular, suppose data points are missing right after your event: this could mean a lot, and interpolating will most likely give completely wrong results!

            I tracked down those cases manually to make sure what happened there. If you choose to interpolate in general, you may want to at least check around your event date case by case.
            Last edited by Jannic Cutura; 13 Apr 2016, 08:26.
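
            One possible way to find the cases worth checking by hand is sketched below. It assumes hypothetical variables id, date, price and event_date (one event per company), and a window of 5 calendar days on either side, which is an arbitrary choice for illustration.

            ```stata
            * Assumed (hypothetical) variables: id, date, price, event_date
            * Window half-width in calendar days; 5 is an arbitrary illustrative choice
            local w = 5

            * Flag missing prices that fall inside the window around the event
            generate byte miss_near_event = missing(price) & inrange(date - event_date, -`w', `w')

            * Number of such days per company, then list them for manual inspection
            egen n_miss_event = total(miss_near_event), by(id)
            sort id date
            list id date event_date if miss_near_event, sepby(id)
            ```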

            • #7
              Anders:
              the fact that you report more missing values for stocks with low trading volume might be a clue to informative missingness in your data.
              Taking all the previous helpful advice into account, I would nevertheless delve into this issue first.
              Kind regards,
              Carlo
              (Stata 19.0)
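
              A rough first look at whether missingness is related to trading volume could be sketched as follows, assuming hypothetical variables id, price and volume. A rank correlation across companies is only one of several possible checks, and an association found this way does not by itself show what mechanism produces the missing prices.

              ```stata
              * Assumed (hypothetical) variables: id, price, volume
              * Per-company share of missing prices and average trading volume
              egen share_miss = mean(missing(price)), by(id)
              egen avg_volume = mean(volume), by(id)

              * One observation per company for the comparison
              egen byte firm_tag = tag(id)

              * Rank correlation between missingness and volume across companies
              spearman share_miss avg_volume if firm_tag
              ```

              If volume is itself missing on days without transactions, avg_volume is computed from the observed days only.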

              • #8
                Thank you for all the input! I will take this into account in my study.
