  • Missing values

    Hello everyone

    I am facing an issue of too many missing values for particular companies in panel data. For example, if my years run from 2005 to 2021, almost all variable values for 2005-2012 are missing. In such a case, imputation is not possible. Is there any other way to deal with this?

  • #2
    I have some difficulty parsing your sentences. Let's first see if I understand your problem. You have panel data, where you observe companies over years. Some of these companies have missing values on (almost) all variables for many years.

    The first thing I would look at is: did those firms exist during those years, did they participate in the activities you are studying in those years, and were they active in the markets you are studying in those years? If not, then you are not dealing with missing values. For a value to be missing it needs to exist, but you just failed to observe it. If a value just does not exist, then it cannot be missing. So companies should only be included for the years in which they exist. There could still be selection bias; there is probably a reason why a company chooses not to be active in a given market. Bottom line: you need to know the reason why those values are missing. Any suggestion we might make is critically dependent on that information. You probably need to do quite a bit of detective work, and guesswork, to figure this out. You can look at the documentation of the dataset to see if it mentions those missing values. You can look at the observed values and see if they give you a hint. You'll probably have to look outside the data, e.g. at the websites of those companies.

    Also, the purpose of methods for dealing with missing data, like imputation, is not to recover the missing data. If you use those methods, then you have given up on retrieving those values. Instead, the purpose is to use as much as possible of the data you did observe. So this is helpful if you have observations with missing values on some variables, but observed values on other variables. If you just ignored those observations entirely, you would be ignoring the information present in the observed values for those observations. That is a waste and it may lead to bias, and that is what multiple imputation tries to solve. If many variables are missing, then there is no information worth recovering. No method can magically create information where none is available. The information has to come from somewhere, be it other variables or assumptions.
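
The distinction drawn above, between observations that still carry some observed values and observations where nothing was observed, can be checked directly. What follows is only a rough sketch in plain Python, not anything posted in the thread: the toy panel, the variable names (`disputed_tax`, `sales`, `assets`) and the helper `classify` are all invented for illustration.

```python
from collections import Counter

# Hypothetical toy panel: one dict per firm-year, None marks a missing value.
# Firms, variable names and numbers are made up for illustration only.
rows = [
    {"firm": "A", "year": 2011, "disputed_tax": None, "sales": None, "assets": None},
    {"firm": "A", "year": 2012, "disputed_tax": None, "sales": 120.0, "assets": 300.0},
    {"firm": "A", "year": 2013, "disputed_tax": 4.5, "sales": 130.0, "assets": 310.0},
    {"firm": "B", "year": 2013, "disputed_tax": 1.2, "sales": None, "assets": 90.0},
]
VARIABLES = ["disputed_tax", "sales", "assets"]


def classify(row, variables=VARIABLES):
    """Label a firm-year by how many of the analysis variables were observed."""
    observed = sum(row[v] is not None for v in variables)
    if observed == len(variables):
        return "complete"
    if observed == 0:
        return "nothing observed"
    return "partially observed"


# Count firm-years in each group.
summary = Counter(classify(r) for r in rows)
for label, count in sorted(summary.items()):
    print(f"{label}: {count}")
```

Only the "partially observed" firm-years contain information that a method such as multiple imputation could put to use; the "nothing observed" ones contribute nothing beyond whatever assumptions are made.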
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Thank you, Maarten Buis. The companies do exist in those years, but the missing values are for more than one variable of interest.



      • #4
        So the question remains: why are they missing? What does "more than one" mean: two, some, most, all time-varying variables, something else? What is the pattern? This can sometimes give you a clue as to why the missing values occur.
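
One way to look at the pattern asked about above is to tabulate which combinations of variables are missing together, and in which years a given variable is missing for each firm. The sketch below uses plain Python on an invented toy panel; the data, the variable names and the helpers `pattern` and `missing_years` are assumptions made for illustration, not part of the thread. In Stata itself, `misstable patterns` produces a comparable overview.

```python
from collections import Counter, defaultdict

# Hypothetical toy panel: one dict per firm-year, None marks a missing value.
rows = [
    {"firm": "A", "year": 2011, "disputed_tax": None, "sales": None},
    {"firm": "A", "year": 2012, "disputed_tax": None, "sales": 120.0},
    {"firm": "A", "year": 2013, "disputed_tax": 4.5, "sales": 130.0},
    {"firm": "B", "year": 2012, "disputed_tax": None, "sales": 80.0},
    {"firm": "B", "year": 2013, "disputed_tax": 1.2, "sales": 85.0},
]
VARIABLES = ["disputed_tax", "sales"]


def pattern(row, variables=VARIABLES):
    """Encode a firm-year as one character per variable: '+' observed, '.' missing."""
    return "".join("." if row[v] is None else "+" for v in variables)


def missing_years(rows, variable):
    """Return, per firm, the sorted list of years in which `variable` is missing."""
    out = defaultdict(list)
    for r in sorted(rows, key=lambda r: (r["firm"], r["year"])):
        if r[variable] is None:
            out[r["firm"]].append(r["year"])
    return dict(out)


# How often does each missingness pattern occur across firm-years?
pattern_counts = Counter(pattern(r) for r in rows)
print("order of variables:", VARIABLES)
for pat, n in pattern_counts.most_common():
    print(pat, n)

# In which years is the key variable missing, firm by firm?
print(missing_years(rows, "disputed_tax"))
```

A block of consecutive early years missing for every firm would point toward a reporting or coverage change in the data source, whereas gaps confined to a few firms would point toward something firm-specific. Either reading is only a hint and still has to be confirmed outside the data.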


        • #5
          Sir Maarten Buis, for a company, for example Adidas operating in India, all values are missing from 2005 to 2021. The variable is disputed taxes. This can only mean it is not reported.



          • #6
            First you are saying that multiple variables are missing, now one variable is missing. Which is it?

            Maybe they just did not dispute the taxes in those years?


            • #7
              Multiple values are also missing, but my main independent variable is disputed taxes. So the only option left is to drop the company completely.



              • #8
                Again, that depends on why those values are missing. As long as you do not tell us that, there is nothing we can do to help you. I suspect there is something else you really should be doing, but I don't want to confuse you with recommendations based on suspicions that might be false. So I will wait until you answer my question.


                • #9
                  Anuradha:
                  Couldn't it be that, due to corporate fiscal policy, taxes are paid/disputed in the country where the company has its main headquarters?
                  Kind regards,
                  Carlo
                  (Stata 19.0)



                  • #10
                    Yes, Carlo Lazzaro, that is a possibility.
