  • log-dummy variables

    Hi everybody,

    I would like to run a regression with the following variables:

    ++++++
    reg logquantity logprice logPriceCustomersegment1 logPriceCustomersegment2 logPriceAdvertisment1 logPriceAdvertisment2 logPriceMarket1 logPriceMarket2 logPriceMarket3 logPriceMarket4 logPriceMarket5 i.season …..
    +++++

    I have roughly 5 dummy variables, each with 3 to 6 levels, which I have transformed into log variables (see above). I already know that I have to drop one level per dummy variable to avoid collinearity.

    My problem is that I would like to run regressions for thousands of different articles, but my dummy variables are often omitted because of collinearity. Is there a way to avoid this collinearity? Or is there a command like “i.” for log-dummy variables?

    Thank you so much.

  • #2
    You can't take logarithms of dummy variables. A dummy variable is coded as 0 or 1. The log of 0 is undefined (missing value). The log of 1 is 0. So when you enter logdummy into your model, the observations where the dummy is zero are omitted, and in all the remaining observations logdummy = 0 and is therefore collinear with the constant in the model.
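
    A minimal sketch of this point, using four made-up observations (the variable names dummy and logdummy are purely illustrative, not taken from the original data):

    ```stata
    * Made-up data: a 0/1 dummy with two observations of each value
    clear
    input byte dummy
    0
    1
    0
    1
    end

    * ln(0) evaluates to missing (.), ln(1) evaluates to 0
    generate logdummy = ln(dummy)
    list

    * Count the observations lost to missing values, then inspect what is left:
    * among the non-missing observations logdummy is constant at 0
    count if missing(logdummy)
    summarize logdummy
    ```

    Any regression that includes logdummy would use only the observations where dummy equals 1, and within those logdummy has no variation.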

    Why are you even thinking of taking logs of dummy variables? What are you actually trying to accomplish? Perhaps if you spell that out, someone will come up with a way to achieve that goal.


    • #3
      For example:
      I had the variables "price" and "customersegment_number". I transformed "price" into "logprice" and generated dummy variables from "customersegment_number". To get interaction effects, I multiplied them: logPrice*Customersegment1, and so on.

      Now I would like to run regressions, but the problem for some observations is collinearity.
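
      One possible way to build such interactions without creating the product variables by hand is Stata's factor-variable notation, where c. marks a continuous variable, i. marks a categorical one, and ## requests main effects plus interactions. The sketch below runs on simulated data: the data-generating lines, the variables price, quantity, and season as coded here, and all numeric values are assumptions for illustration only. With real data, only the final regress line would be relevant.

      ```stata
      * Simulated data, for illustration only (all values are made up)
      clear
      set seed 12345
      set obs 200
      generate customersegment_number = ceil(3*runiform())   // segments 1 to 3
      generate season = ceil(4*runiform())                    // seasons 1 to 4
      generate price = 1 + 9*runiform()
      generate quantity = exp(5 - 0.8*ln(price) + rnormal())

      * Log-transform only the continuous variables
      generate logprice = ln(price)
      generate logquantity = ln(quantity)

      * Main effects of logprice and segment, plus their interaction;
      * Stata picks a base level for each factor variable on its own
      regress logquantity c.logprice##i.customersegment_number i.season
      ```

      With this notation Stata reports which levels it treats as base or omits, so a segment that does not occur for a given article shows up in the output instead of having to be dropped manually.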


      • #4
        Tina:
        as an aside to Clyde's helpful insight, I suspect your regression model is at risk of endogeneity: other things being equal, -logquantity- can help explain variation in -logprice- just as -logprice- can help explain variation in -logquantity-.
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          In #1 you said you had log-transformed your dummy variables, but in #3 it appears you did not. I'm completely confused. I think you need to show us a sample of your data as well as the complete exact code you used to create your regression variables, and the regression command and output, too. Do read FAQ #12 for instructions on how to most helpfully post example data (-dataex-) and code/output (code delimiters).
