  • Multicollinearity - Lagged Independent Variable

    I am assessing the impact of economic sanctions on the five major components of GDP (imports, exports, consumption, expenditure, and investment) using a fixed effects model for panel data.

    I've classified different sanctions levels into four dummy variables: threats, limited, moderate, and extensive.

    I've included a number of other controls, such as L.Log(GDPpc), L.Log(Population), democracy, conflict, etc.

    Issue:

    I created a series of dummy variables to represent a country's openness to trade, (Exports + Imports)/GDP, classifying countries into four levels of openness by quartile: low, medium, high, and very high. This variable is lagged.

    Question: When I am investigating the effect of economic sanctions on imports and exports, would it be inappropriate to include the lagged trade openness variable ((Exports + Imports)/GDP), since imports and exports are components of it? Stata shows these variables to be statistically significant.

    Thank you,

    Ryan

  • #2
    There is no issue having trade openness on the right hand side when your outcome is either imports or exports. It will not necessarily be correlated to either and you will find that such an inclusion is quite common in empirical macroeconomics. My only question would be why are you categorizing the variable? You are throwing away information by doing so.
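
    As a minimal sketch of keeping openness continuous instead of binning it into quartiles: this assumes the panel is already -xtset countryid year-, and that imports, exports and gdp are hypothetical names for the level variables (they are not taken from the poster's dataset). The lagged ratio then enters the model directly.

    Code:
    * Assumption: data are -xtset countryid year-; imports, exports, gdp are hypothetical names
    gen openness_c = (imports + exports)/gdp
    * One-year lag of the continuous ratio, with year effects and country-clustered SEs
    xtreg logimports threat limited moderate extensive L.openness_c i.year, fe vce(cluster countryid)

    A single coefficient on L.openness_c uses all the variation in the ratio, whereas four quartile dummies only distinguish which bin a country-year falls into.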

    • #3
      Originally posted by Andrew Musau
      There is no issue having trade openness on the right hand side when your outcome is either imports or exports. It will not necessarily be correlated to either and you will find that such an inclusion is quite common in empirical macroeconomics. My only question would be why are you categorizing the variable? You are throwing away information by doing so.
      Thank you for your reply.

      But according to how I defined trade openness (imports, exports, GDP), naturally these would greatly explain my dependent variable (either imports or exports) since my dependent variable is a component of the independent variable. Wouldn't that create reverse causality issues? Note: In the Stata models I ran, "medium trade openness" was found to be statistically insignificant while the other three levels were statistically significant.

      As to your second point, I agree. I could just leave trade openness defined as ((Imports + Exports)/GDP) rather than categorize it into dummy variables. That would make the estimates far more accurate.

      • #4
        But according to how I defined trade openness (imports, exports, GDP), naturally these would greatly explain my dependent variable (either imports or exports) since my dependent variable is a component of the independent variable.
        Your definition of trade openness is the standard definition used in the literature (sum of imports and exports as a share of GDP). I have already answered your question in #2, but maybe a simple example can help you understand the logic of the argument. Consider a country that has large variability in both imports and exports, but trade openness is relatively constant.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input float(imports exports GDP)
        1000 1000 10000
        1200  800 10030
         800 1200 10060
         600 1400 10090
        1500  500 10120
         700 1300 10150
         800 1200 10180
         550 1450 10220
        1050  950 10250
        1200  800 10280
        end
        
        gen openness= (imports+ exports)/GDP
        corr imports openness
        This yields a correlation of almost 0, implying that information on either imports or exports cannot help you predict trade openness.

        Code:
        . corr imports openness
        (obs=10)
        
                     |  imports openness
        -------------+------------------
             imports |   1.0000
            openness |   0.0438   1.0000
        This is an extreme example, but it serves to illustrate the point. Granted, if you only have countries where either imports or exports dominate the trade balance, then having information on the dominant component is sufficient to predict trade openness. The scaling variable, GDP, for most countries, changes little year by year. As long as your sample is not composed of a few countries where either imports or exports dominate, you should be fine. A colleague of mine and I have compiled a data set of 131 countries for the period 1984-2016 which includes trade openness, imports and exports as well as other macroeconomic variables, available here. You can verify for yourself that for such a large group of countries, the correlation between openness and either imports or exports does not preclude one from combining them in a model.
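
        Returning to the toy data above: imports and exports sum to 2000 in every row, so exports = 2000 - imports and the correlation between exports and openness must be the exact negative of the imports result (about -0.0438). A short sketch of that check, to be run after the example code:

        Code:
        * exports = 2000 - imports in the example data, so only the sign flips
        corr exports openness
        * all pairwise correlations in one table
        pwcorr imports exports openness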
        Last edited by Andrew Musau; 06 Dec 2018, 05:44.

        • #5
          Thank you very much for your reply, Andrew! After speaking with my professor I decided to include this variable in my model.

          ----

          I'm running the following model to test whether sanctions issued by the United States are significantly different from those issued by other countries.

          US vs. Non-US

          xtreg logimports threat limited moderate extensive threatusa limitedusa moderateusa extensiveusa L.loggdppercapita L.logpopulation minorconflict majorconflict tradeopenness i.year, fe cluster(countryid)

          Where:

          threat = if any country (including USA) issued a sanction threat against target
          limited = any country issued a limited sanction in severity against target
          moderate = Same as limited but more severe
          extensive = Same as moderate but more severe

          threatusa = If the United States made a threat against the target nation
          limitedusa = repeat of limited except for USA
          moderateusa = repeat of moderate except for USA
          extensiveusa = repeat of extensive except for USA

          The rest of the variables are various control variables.


          Question: Is it correct to include both the world sanctions with the USA sanctions? The issue I'm contemplating is that if the USA issues a threat against another country, both the -threat- and -threatusa- dummy variables would take a value of 1. This is because -threat- includes both all countries AND the USA. Do I need to differentiate between these variables? Should the -threat- variable include all countries EXCEPT the USA, while the threatusa includes ONLY the USA?

          • #6
            My apologies, Ryan Taylor, I completely missed this question.

            Question: Is it correct to include both the world sanctions with the USA sanctions? The issue I'm contemplating is that if the USA issues a threat against another country, both the -threat- and -threatusa- dummy variables would take a value of 1. This is because -threat- includes both all countries AND the USA. Do I need to differentiate between these variables? Should the -threat- variable include all countries EXCEPT the USA, while the threatusa includes ONLY the USA?

            No, you should not change the threat variable. The standard way to do what you want is through adding interactions to the model, but note that the variable that you call "threatusa" is simply an interaction between the variable threat and an indicator for the United States. You can verify this as follows:

            Code:
            gen interaction = threat*usa 
            compare interaction threatusa
            where usa takes the value 1 if the country is the United States and zero otherwise. So that the model is not misspecified, make sure that you also include an indicator for usa in the regression. The most efficient way to do this is by using factor variables:

            Code:
            xtreg logimports c.threat##c.usa ...
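
            To see how the nested coding is read, here is a sketch that reuses the command and variable names exactly as posted in #5 (it assumes the data are already -xtset countryid year-). Because -threat- equals 1 for every sender including the United States, the coefficient on -threatusa- is the US differential, and the total effect of a US threat is the sum of the two coefficients.

            Code:
            * Model as posted in #5; assumes -xtset countryid year- has been run
            xtreg logimports threat limited moderate extensive threatusa limitedusa moderateusa extensiveusa L.loggdppercapita L.logpopulation minorconflict majorconflict tradeopenness i.year, fe cluster(countryid)
            * Total effect of a US-issued threat = baseline threat effect + US differential
            lincom threat + threatusa
            * Joint Wald test that US sanctions do not differ from others at any severity level
            test threatusa limitedusa moderateusa extensiveusa

            If the joint test does not reject, the data give no evidence that US-issued sanctions differ from those of other senders at any level.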
