  • drop variables or use dummies?

    Good morning,

    I have been running an analysis of the determinants of the maturity structure for a set of firms using quantile regression. I have been told to remove part of the sample because those companies show odd behavior. The thing is that removing them from the sample would reduce the number of observations by 30%, so I would rather not do it. My question is: can I use dummies to control for these companies without removing their observations?

    Thanks in advance

  • #2
    Marcos:
    Yes, using dummies sounds wise.
    Kind regards,
    Carlo
    (Stata 18.0 SE)


    • #3
      I would actually be a bit more reluctant here. Both sound like purely technical, theoretically blind "solutions". If 30% of your sample (and, if it is representative, therefore of the population your statistical inference will be about) behaves "oddly", I would spend some time thinking about what exactly "odd" means and try to find a theoretical explanation. Taking it from there, I would think about how this could be modeled appropriately.

      By merely including an indicator for "odd" companies, you let only the intercept differ between the groups; the coefficients of the remaining predictors are still forced to be the same for both groups of companies. With little information about what "odd" really means and why it occurs, I think this restriction is a pretty strong assumption.

      Best
      Daniel
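
The restriction described in #3 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are made up, the group sizes and slopes (2.0 for "regular" firms, -1.0 for "odd" firms) are hypothetical, and ordinary least squares is swapped in for quantile regression to keep the example short. The same specification issue applies to a quantile regression with an indicator. The helper `common_slope_with_dummy` is an illustrative name, not a library function; it uses the fact that the slope in a model with a group indicator equals the slope after demeaning x and y within each group.

```python
import random


def slope(xs, ys):
    """OLS slope of y on x in a model with an intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def common_slope_with_dummy(groups):
    """Slope of x in y = a + d*group + b*x.

    Including a group indicator is equivalent to demeaning x and y
    within each group and pooling the demeaned data.
    """
    num = 0.0
    den = 0.0
    for xs, ys in groups:
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        num += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        den += sum((x - mx) ** 2 for x in xs)
    return num / den


def simulate(rng, n, intercept, b):
    """Hypothetical data: y = intercept + b*x + noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [intercept + b * x + rng.gauss(0, 1) for x in xs]
    return xs, ys


rng = random.Random(1)
regular = simulate(rng, 700, 1.0, 2.0)   # assumed 70% of the sample
odd = simulate(rng, 300, 5.0, -1.0)      # assumed 30% of the sample

b_dummy = common_slope_with_dummy([regular, odd])  # one slope for everyone
b_regular = slope(*regular)                        # slope for regular firms only
b_odd = slope(*odd)                                # slope for odd firms only

print(f"slope with indicator only: {b_dummy:.2f}")
print(f"slope, regular firms:      {b_regular:.2f}")
print(f"slope, odd firms:          {b_odd:.2f}")
```

In this made-up setting the single slope from the indicator-only model lands between the two group slopes and describes neither group. Letting the slope differ by group (an interaction term) is what recovers the two separate values.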


      • #4
        Daniel is right.
        My reply was actually constrained by choosing the lesser evil between dropping and including a dummy.
        Kind regards,
        Carlo
        (Stata 18.0 SE)


        • #5
          In this context the odd behavior refers to a different pattern in the evolution of the average maturity of debt, which is my dependent variable. Therefore I have been told to remove this kind of company, but doing so would reduce the sample sharply. This odd behavior happens mostly in small companies with low debt. So maybe I can control for this issue with another kind of variable; I do not know. My first idea was to use dummies, but I am open to suggestions.

          Regards


          • #6
            Unfortunately not my field of interest at all, but the direction you are now going seems much better. If small companies tend to behave differently, then including size of company as a predictor probably makes more sense than an indicator. You can then think about why size does matter and whether the effect of other predictors might depend on size, too. Details depend much on the specific research question you are trying to answer here. As I said, I cannot contribute much on this, as I am not an economist.

            Best
            Daniel


            • #7
              I am not an economist either, but I would tentatively suggest log size, rather than size, as a predictor.


              • #8
                In addition to the insightful comments above, I would suggest reporting both sets of results to check whether excluding the observations changes the inference. A discussion of what you refer to as "odd behavior" should be included to justify the reduced sample, along with whether or not it affects your results. It is very common in economics papers to see an added section where various robustness checks are reported.
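
The robustness check suggested in #8 could be sketched as follows: fit the same median regression on the full sample and on the reduced sample, then compare the coefficients. Everything here is an assumption made for illustration: the data are simulated, the 70/30 split and the coefficients are hypothetical, and `quantreg_line` is a brute-force helper written for this sketch, not a library routine. It relies on the fact that a one-predictor quantile regression has an optimal line passing through two data points, so it simply tries every pair. In Stata the same comparison would presumably be two `qreg` runs, one with the "odd" firms excluded.

```python
import random
from itertools import combinations


def check_loss(residuals, tau):
    """Quantile ("check") loss: tau*u for u >= 0, (tau - 1)*u for u < 0."""
    return sum(u * (tau - (u < 0)) for u in residuals)


def quantreg_line(xs, ys, tau=0.5):
    """Brute-force quantile regression of y on one predictor x.

    Tries the line through every pair of points and keeps the one with the
    smallest check loss. Fine for small illustrative samples only.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, not a valid candidate
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        loss = check_loss([y - a - b * x for x, y in zip(xs, ys)], tau)
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]


def simulate(rng, n, intercept, b):
    """Hypothetical data: y = intercept + b*x + noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [intercept + b * x + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


rng = random.Random(42)
x_reg, y_reg = simulate(rng, 70, 1.0, 0.5)    # assumed "regular" firms
x_odd, y_odd = simulate(rng, 30, 4.0, -0.3)   # assumed "odd" firms

a_full, b_full = quantreg_line(x_reg + x_odd, y_reg + y_odd)  # full sample
a_red, b_red = quantreg_line(x_reg, y_reg)                    # odd firms dropped

print(f"full sample:    intercept={a_full:.2f}, slope={b_full:.2f}")
print(f"reduced sample: intercept={a_red:.2f}, slope={b_red:.2f}")
```

If the two sets of coefficients lead to the same conclusions, the exclusion matters little; if they differ, that difference is itself a result worth reporting and explaining.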


                • #9
                  I am also not an economist or finance specialist, but I just want to reinforce from a slightly different perspective what Daniel Klein has been saying here.

                  One of the worst things one can do in developing models is to throw out data based on the outcome variable behaving "oddly" in some way that cannot be explained with real predictor variables. By real predictor variables, I mean predictor variables whose values can be ascertained without knowledge of the outcome variable. This is because if one does that, the result is a model whose domain of applicability is unknowable. That is, confronted with a new firm or even the same firms in the future, the model can be run and the only prediction will be "this is what will happen,... unless it doesn't." Adding a variable to the model that indicates the observations with "odd" outcomes doesn't help, because nobody can know which way to set that variable for the new case until after the outcome has been observed! So the model can predict nothing.


                  • #10
                    Marcos:
                    I would also search the literature in your research field to ascertain whether that "odd behaviour" (whatever that means) might be the result of something that is clearly rooted in theory and can be easily included in a regression model.
                    Kind regards,
                    Carlo
                    (Stata 18.0 SE)
