
  • Analysis of Log Level Regression

    I am running OLS regression with a logged dependent variable (llta) - this is a log of terrorist attacks per country by year. My independent variable (hu2) is a binary variable - 0 = no intervention, 1 = intervention. Then I have 7 controls.

    I have read online that because my dependent variable is logged, I am running log-level regression. The coefficient for hu2 is 4.521045. I have read online that I multiply this by 100 to get 452. Does this mean that the levels of terrorism increase by 452 percent with intervention?

    When I get to the controls, lgdpc and lpop (gdp per capita and population) are logged. Do I run log-log regression analysis in these instances? For example, the coefficient for lgdpc is 3.375022. Does this mean that a 1 percent increase in gdp per capita results in a 3.4 percent increase in levels of terrorism?

    I have attached the results.
    Attached Files
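
For reference, a sketch of the standard textbook reading of the two coefficients quoted above, taking the estimates at face value. The symbols ($y$ for attacks, $D$ for hu2, $x$ for GDP per capita, $\beta$ and $\gamma$ for their coefficients) are notation introduced here, not taken from the attached output. The "multiply by 100" shortcut is only a good approximation when the coefficient on a dummy is small in absolute value; for a coefficient as large as 4.52 the exact expression gives a very different figure.

```latex
\begin{align*}
\ln y &= \alpha + \beta D + \gamma \ln x + \dots \\
\%\Delta y \,\big|_{D:\,0 \to 1} &= 100\,\bigl(e^{\beta} - 1\bigr)
  = 100\,\bigl(e^{4.521045} - 1\bigr) \approx 9{,}100\% \\
\%\Delta y \,\big|_{x \to 1.01x} &= 100\,\bigl(1.01^{\gamma} - 1\bigr)
  = 100\,\bigl(1.01^{3.375022} - 1\bigr) \approx 3.4\%
\end{align*}
```

So the elasticity reading of the log-log coefficient (about 3.4 percent per 1 percent increase in GDP per capita) holds, while the dummy coefficient implies roughly a 92-fold level, not a 452 percent increase.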

  • #2
    Dear Ally,

    You are interested in the number of terrorist attacks, which is a count. Have you considered using a count data model? The way you are doing it, you lose all observations with zero attacks, and that may severely bias your results.

    All the best,

    Joao



    • #3
      When I logged the variable, I used gen llta = log(ta+0.00001) so that the zeros would not be lost. I logged the variable because the data were positively skewed. Is that incorrect? I am unfamiliar with the process of analysing logged variables in OLS.



      • #4
        Dear Ally,

        The trick you used is not really appropriate (but very popular!). Since you have count data, you really should be using a count data model; these models are designed to deal with positively skewed data with many zeros.

        All the best,

        Joao



        • #5
          I am analysing the effect of intervention on terrorism and conflict. I'm interested in the trend more than in the exact number of attacks. I am also analysing the effect on battle deaths. Both variables are extremely skewed, and one of the models shows heteroskedasticity when run. My supervisor advised logging the variable, and the results have been significantly better as a result. When I run the models on the raw counts (OLS and nbreg), the results are completely different. Does logging inflate the effect of the variables?



          • #6
            Dear Ally,

            Taking logs (especially after adding a constant because of the zeros) can severely distort the results. Using NB or Poisson regression would be a far better alternative, but that is something you need to discuss with your supervisor. Using plain OLS on the original data is likely to also lead to unreliable results.
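
A minimal sketch of why the added constant matters, using made-up toy counts rather than the data in this thread. It relies on two standard facts: with a single 0/1 regressor, the OLS slope equals the difference in group means of the outcome, and the Poisson maximum-likelihood coefficient equals the log of the ratio of group means. The group lists, function names, and the set of constants are all hypothetical choices for illustration.

```python
import math

# Hypothetical toy counts (NOT the poster's data): attacks per country-year.
no_intervention = [0, 0, 0, 1, 2, 0, 5, 0]
intervention = [0, 3, 1, 0, 8, 2, 0, 4]


def mean(xs):
    return sum(xs) / len(xs)


def log_ols_dummy_coef(y0, y1, c):
    """OLS slope of log(y + c) on a single 0/1 dummy.

    With one binary regressor this is just the difference between
    the two group means of log(y + c).
    """
    return mean([math.log(y + c) for y in y1]) - mean([math.log(y + c) for y in y0])


def poisson_dummy_coef(y0, y1):
    """Poisson MLE coefficient on a single 0/1 dummy (with intercept).

    The fitted group means equal the sample group means, so the
    coefficient is the log of their ratio. Zeros need no adjustment.
    """
    return math.log(mean(y1) / mean(y0))


# Same data, three different "small constants" added before logging.
constants = [1.0, 0.01, 0.00001]
coefs = {c: log_ols_dummy_coef(no_intervention, intervention, c) for c in constants}
poisson_coef = poisson_dummy_coef(no_intervention, intervention)

for c, b in coefs.items():
    print(f"log(y + {c:g}) on dummy: coefficient = {b:.3f}")
print(f"Poisson coefficient (log ratio of means) = {poisson_coef:.3f}")
```

In this toy setup the two groups differ in their share of zeros, so the smaller the constant, the further the zeros are pushed down on the log scale and the larger the estimated coefficient becomes. The Poisson coefficient does not depend on any such choice.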

            All the best,

            Joao



            • #7
              Note that the same issues are being pursued in two threads.
              http://www.statalist.org/forums/foru...og-my-variable

              At least you are getting firm and consistent advice here not to do what your supervisor is apparently recommending.
              Last edited by Nick Cox; 18 Aug 2015, 07:59.
