  • Why am I experiencing high P-Value and positive correlation on this?

    [Attached screenshot: 2018-09-28 21_33_08-2 - Stata_IC 15.1.png]


    I'm a bit confused about my dummy variable results. I am running a regression on log GDP. The three dummies are for years with recession. Recession 2 and 3 are positively correlated with GDP, which doesn't make sense, and my p-values are huge. Can someone give me some thoughts on what my error might be and how to resolve it?

  • #2
    You shared a snapshot. Please read the FAQ, particularly the topic about sharing data/command/output. There you will also find the recommendation to avoid snapshots.

    That being said, the command seems to be mistyped: there is a space between the factor notation and the first recession dummy.

    I am also wondering whether the factor notation couldn't be used to encompass all three dummies at once.
    Best regards,

    Marcos



    • #3
      From what I see with your first recession, your indicator variables might all be coded 0/1, and if so this won't make much difference, but you could try
      Code:
      xtreg lGDP lmilitaryspending lcapitalformation i.(recession1 recession2 recession3), fe
      or
      Code:
      xtreg lGDP lmilitaryspending lcapitalformation i.recession?, fe
      You might also want to look at
      Code:
      regress lGDP lmilitaryspending lcapitalformation i.(recession1 recession2 recession3 id)
      estat vif



      • #4
        Bob:
        as an aside to the previous helpful replies, at first glance you are probably experiencing a quasi-extreme multicollinearity issue with your data.
        Like Marcos and Joseph, I would also recommend that you stop creating categorical variables and/or interactions by hand and be aware of the so-called dummy trap (https://en.wikipedia.org/wiki/Dummy_..._(statistics)): using -fvvarlist- notation will improve your coding and eliminate the risk of falling into the dummy trap.
        Finally, you seem to be dealing with a T>N panel dataset; if that is the case, take a look at -xtgls-.
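        As a rough sketch only, an -xtgls- call for this model might look like the following. It assumes that the panel identifier is called id and the time variable Year (neither is confirmed by the screenshot), and the error structure chosen here is purely illustrative, not a recommendation:
        Code:
        * Assumption: id is the panel identifier and Year the time variable
        xtset id Year
        * Illustrative error structure: panel heteroskedasticity with a common AR(1) term
        xtgls lGDP lmilitaryspending lcapitalformation i.(recession1 recession2 recession3), panels(heteroskedastic) corr(ar1)
        Which panels() and corr() options are appropriate depends on the data, so please check -help xtgls- before relying on this.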
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Originally posted by Marcos Almeida

          I am also wondering whether the factor notation couldn't be used to encompass all three dummies at once.
          Hi Marcos,

          Thanks for the reply. I did try that as well: I ran gen recession = (Year==2001 | Year==2008 | Year==2009) to create one dummy variable. The coefficient was -.002782, which is in the right direction. However, the p-value was still 0.953, i.e. statistically insignificant.



          • #6
            Please note our policy on cross-posting, which is that you are asked to tell us about it. This is spelled out in the FAQ Advice that all are asked to read before posting.

            I allude to a concurrent thread on Reddit, which has featured some pertinent comments.



            • #7
              Hi Marcos,

              Thanks for the reply. I did try that as well: I ran gen recession = (Year==2001 | Year==2008 | Year==2009) to create one dummy variable. The coefficient was -.002782, which is in the right direction. However, the p-value was still 0.953, i.e. statistically insignificant.
              Thanks for the information. My suggestion actually concerned the use of factor notation to deal with a single categorical variable, instead of creating several dummies by hand, with the pitfalls Carlo described.
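              To illustrate what I mean, here is a minimal sketch. It assumes that your three dummies flag the years 2001, 2008 and 2009 respectively and that the data are already -xtset-; the variable name recperiod is hypothetical:
              Code:
              * Assumption: recession1-3 correspond to 2001, 2008 and 2009
              * recperiod is a hypothetical name for the single categorical variable
              generate byte recperiod = 0
              replace recperiod = 1 if Year == 2001
              replace recperiod = 2 if Year == 2008
              replace recperiod = 3 if Year == 2009
              label define recperiod 0 "No recession" 1 "2001" 2 "2008" 3 "2009"
              label values recperiod recperiod
              * Factor notation builds the indicators and omits the base level (0)
              xtreg lGDP lmilitaryspending lcapitalformation i.recperiod, fe
              With one categorical variable, Stata chooses the base category itself, so there is no risk of entering a redundant dummy.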
              Best regards,

              Marcos



              • #8
                I'm not an economist. But my (possibly naive) thought is that you're running a regression on country-year data, without controlling for year. If you had no dummies for recession, your constant would represent the mean log GDP over each country's time series (with military spending and capital formation set to 0). With the recession dummies, you essentially find that the adjusted log GDP in years 2008 and 2009 is not distinguishably different from the mean log GDP across the entire time series. I can't quite explain why, but I suspect if you add year to your regression as a continuous variable, you could see something different.
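                As a sketch of what I have in mind, assuming the time variable is named Year (as in the gen command quoted earlier in the thread) and the data are already -xtset-, adding a linear time trend would look something like this:
                Code:
                * Assumption: Year is the numeric year variable
                * c.Year enters year as a continuous (linear trend) regressor
                xtreg lGDP lmilitaryspending lcapitalformation i.(recession1 recession2 recession3) c.Year, fe
                I have not run this on your data, so I can't say whether it changes the picture.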
                Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.
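                For example, something along these lines would post the first 20 observations of the relevant variables; the names id and Year are assumptions about how your dataset is set up:
                Code:
                * Assumption: id and Year are the panel and time variables
                dataex id Year lGDP lmilitaryspending lcapitalformation recession1 recession2 recession3 in 1/20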

                When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



                • #9
                  I can imagine a variety of problems with this model. If the data are not in constant dollars, then you may simply be showing that inflation matters. Alternatively, it is possible that high GDP facilitates military spending instead of the opposite. I'd also worry about the recession variables: they were probably generated from the change in GDP or something very close to it, so including them on the right-hand side may be problematic in a model explaining GDP.

