
  • First differences implementation

    Hello everybody, I have a question about my Stata first-differences implementation.

    I have 3 time periods and 381 unit observations.

    My code is the following:

    xtset id year

    gen dy1 = d.y1

    gen dx1 = d.x1

    I repeat the same process for all my control variables. After that I run the regression:

    reg dy1 dx1 dx2....dxt i.id i.year, noconstant

    I include i.id and i.year to control for area and time fixed effects (my units of observation are different areas).

    Is this correct? I also have some missing observations in the control variables, and I don't know whether I should do something about them.
    Thank you




  • #2
    Ivan:
    why not use -xtreg, fe-?
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      1. Accounting for missing information is a very good idea. I suggest MICE (multiple imputation by chained equations); see: https://stats.oarc.ucla.edu/stata/se...stata_pt1_new/
      2. Carlo raised an important point. What are your arguments for FD? The differences between FD and FE can be subtle; this source discusses some implications you should be aware of: https://economics.stackexchange.com/...rst-difference
      3. In general your approach is fine, but there are some things I would change. You can simplify your command as follows:
      Code:
      xtset id year
      reg d.y d.x1 d.x2 d.x3 i.year, nocons vce(robust)
      I would omit the panel indicator (i.id), since differencing already removes the time-invariant area effects; I wonder whether the command even runs when it is included. I am also not sure whether the year FE should be included.
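
      To make the FD versus FE comparison and the missing-data point concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the data are generated, not the poster's, and only one control (x1) is used. It reflects that, in first differences, a missing value in one period removes both the difference ending in that period and the one starting from it, so the FD sample can shrink faster than the FE sample.

      ```stata
      * Hypothetical simulated panel: 381 areas, 3 periods
      * (variable names id, year, y, x1 follow the thread; the data are made up)
      clear
      set seed 12345
      set obs 381
      gen id = _n
      gen a  = rnormal()                      // time-invariant area effect
      expand 3
      bysort id: gen year = _n
      gen x1 = rnormal() + a
      gen y  = 1 + 0.5*x1 + a + 0.2*year + rnormal()
      replace x1 = . if runiform() < 0.05     // a few missing control values
      xtset id year

      * How much is missing before estimating anything
      misstable summarize y x1

      * First differences, standard errors clustered by area
      regress D.y D.x1 i.year, vce(cluster id)
      estimates store fd

      * Fixed effects on the same data
      xtreg y x1 i.year, fe vce(cluster id)
      estimates store fe

      * Side-by-side coefficients, standard errors, and sample sizes
      estimates table fd fe, b se stats(N)
      ```

      Comparing the N reported for each model shows how many observations each estimator actually uses once the missing values are taken into account.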
      Best wishes

      Stata 18.0 MP



      • #4
        Ivan:
        in addition, please note that, unlike what happens with -xtreg-, the -robust- and -vce(cluster clusterid)- options do different jobs with -regress-.
        Kind regards,
        Carlo
        (Stata 19.0)
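
        The point about standard errors can be sketched as follows, again on simulated data that are purely an assumption for illustration. With -regress-, vce(robust) only allows for heteroskedasticity, while vce(cluster id) also allows errors to be correlated within an area. That matters for first differences, because consecutive differences from the same area share an error term. With -xtreg, fe-, specifying vce(robust) is treated as clustering on the panel variable.

        ```stata
        * Hypothetical simulated panel (not the poster's data)
        clear
        set seed 2024
        set obs 381
        gen id = _n
        gen a  = rnormal()                    // time-invariant area effect
        expand 3
        bysort id: gen year = _n
        gen x1 = rnormal() + a
        gen y  = 1 + 0.5*x1 + a + rnormal()
        xtset id year

        * -regress-: the two options give different standard errors
        regress D.y D.x1 i.year, vce(robust)       // heteroskedasticity-robust only
        regress D.y D.x1 i.year, vce(cluster id)   // also allows within-area correlation

        * -xtreg, fe-: vce(robust) and vce(cluster id) report the same standard errors
        xtreg y x1 i.year, fe vce(robust)
        xtreg y x1 i.year, fe vce(cluster id)
        ```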



        • #5
          Thank you for the replies. Carlo, the reason I am using FD instead of FE is simply that I am replicating a research paper, and the assumptions used by its author are also valid in my case. However, both of your points are valid, since I have a small number of time periods. I guess it will be a limitation of my paper.



          • #6
            Also, once I include i.year, the coefficient on my main explanatory variable changes sign. Is this unusual, or is it completely normal?



            • #7
              I wouldn't say that it is unusual, nor would I call it completely normal. It is a situation that doesn't arise very often, but sometimes does. It happens when the within-time effect of your variable is opposite to the between-time effect. Here's an example, and the plot of the data at the end makes it clear what's going on.

              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input float(time x y)
              1  1  17.67043
              1  2 15.075164
              1  3 13.394225
              1  4  10.96041
              1  5  7.820862
              2  6  39.78113
              2  7  37.63985
              2  8  32.94478
              2  9 32.163788
              2 10 25.713106
              3 11  60.33619
              3 12  58.55099
              3 13  57.55923
              3 14  51.42798
              3 15  50.41329
              4 16  76.74454
              4 17  72.08629
              4 18  72.17093
              4 19  70.47061
              4 20 66.777596
              5 21  97.34257
              5 22  96.00155
              5 23  94.79646
              5 24  90.47357
              5 25  87.11472
              end
              
              regress y x
              regress y x i.time
              
              graph twoway scatter y x, msym(i) mlab(time)



              • #8
                Thank you Clyde, I did the graph for my data, and although it is not as evident as your example, I can see a small negative relationship.
