First differences implementation

Ivan Ivanov

Join Date: Sep 2024

Posts: 11
#1

First differences implementation

09 Sep 2025, 08:47

Hello everybody, I have a question about my first-differences implementation in Stata.

I have 3 time periods and 381 units.

My code is the following:

xtset id year

gen dy1 = d.y1

gen dx1 = d.x1

I repeat the same process for all my control variables. After that I run the regression:

reg dy1 dx1 dx2....dxt i.id i.year, noconstant

I include i.id and i.year to control for area and time fixed effects (my units of observation are different areas).

Is this correct? Also, I have some missing observations in the control variables, so I don't know whether I should do something about that.
Thank you
Tags: firstdifference
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17748
#2

10 Sep 2025, 00:57

Ivan:
why not use -xtreg, fe-?
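For illustration only, a minimal sketch of that alternative. It assumes the outcome is y1 and the regressors are x1 and x2, where x2 is a hypothetical stand-in for the remaining controls that the original post elides:

Code:

xtset id year
* within (fixed-effects) estimator: the id effects are absorbed by -fe-,
* and the year effects are added explicitly as dummies
xtreg y1 x1 x2 i.year, fe vce(cluster id)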

Kind regards,
Carlo
(Stata 19.0)
Felix Bittmann

Join Date: Aug 2018

Posts: 766
#3

10 Sep 2025, 02:16

1. Accounting for missing information is a very good idea. I suggest MICE, see: https://stats.oarc.ucla.edu/stata/se...stata_pt1_new/
2. Carlo raised an important point. What are your arguments for FD? The differences between FD and FE can be subtle; this source discusses some implications you should be aware of: https://economics.stackexchange.com/...rst-difference
3. In general your approach is fine. However, there are some things I would change; for one, you can simplify your command like this:

Code:

xtset id year
reg d.y d.x1 d.x2 d.x3 i.year, nocons vce(robust)

I would omit the panel indicator (i.id); I wonder if the command even computes when it is included. Whether or not to include the year FE is also something I am not sure about.
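On the panel indicator, a short derivation may help. It assumes the usual two-way levels model with unit effects \alpha_i and period effects \lambda_t, which the thread does not state explicitly, so treat it as a sketch:

\[
y_{it} = \alpha_i + \lambda_t + x_{it}'\beta + \varepsilon_{it}
\quad\Longrightarrow\quad
\Delta y_{it} = \Delta\lambda_t + \Delta x_{it}'\beta + \Delta\varepsilon_{it}
\]

The \alpha_i drop out on differencing, so i.id is not needed to remove level fixed effects. Adding unit dummies to the differenced equation would instead correspond to a levels model with a unit-specific linear trend g_i t, because

\[
\Delta (g_i t) = g_i t - g_i (t-1) = g_i .
\]

With 3 periods there are at most 2 differences per unit, so this would spend up to 381 extra parameters on at most 762 observations.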

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17748
#4

10 Sep 2025, 07:27

Ivan:
in addition, please note that the -vce(robust)- and -vce(cluster clusterid)- options do different jobs with -regress-, unlike with -xtreg-, where they are equivalent.
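A hedged sketch of the contrast, again assuming variables y1, x1 and x2 (x2 is a hypothetical placeholder for the other controls) and leaving the constant at Stata's default:

Code:

xtset id year
* -regress-: vce(robust) corrects for heteroskedasticity only
regress d.y1 d.x1 d.x2 i.year, vce(robust)
* -regress-: vce(cluster id) also allows correlated errors within each id
regress d.y1 d.x1 d.x2 i.year, vce(cluster id)
* -xtreg, fe-: vce(robust) is treated as vce(cluster id)
xtreg y1 x1 x2 i.year, fe vce(robust)

The point estimates from the two -regress- calls are identical; only the standard errors differ.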

Kind regards,
Carlo
(Stata 19.0)
Ivan Ivanov

Join Date: Sep 2024

Posts: 11
#5

10 Sep 2025, 10:32

Thank you for the replies. Carlo, the reason I am using FD instead of FE is simply that I am replicating a research paper, and the assumptions used by the author are also valid in my case. However, both of your points are valid, since I have a small number of time periods. I guess it will be a limitation of my paper.
Ivan Ivanov

Join Date: Sep 2024

Posts: 11
#6

10 Sep 2025, 10:34

Also, I would like to add that once I include i.year, the coefficient on my main explanatory variable changes sign. Is this unusual, or is it completely normal?

Clyde Schechter

Join Date: Apr 2014
Posts: 30189
#7

10 Sep 2025, 11:12

I wouldn't say that it is unusual, nor would I call it completely normal. It is a situation that doesn't arise very often, but sometimes does. It happens when the within-time effect of your variable is opposite to the between-time effect. Here's an example, and the plot of the data at the end makes it clear what's going on.

Code:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(time x y)
1  1  17.67043
1  2 15.075164
1  3 13.394225
1  4  10.96041
1  5  7.820862
2  6  39.78113
2  7  37.63985
2  8  32.94478
2  9 32.163788
2 10 25.713106
3 11  60.33619
3 12  58.55099
3 13  57.55923
3 14  51.42798
3 15  50.41329
4 16  76.74454
4 17  72.08629
4 18  72.17093
4 19  70.47061
4 20 66.777596
5 21  97.34257
5 22  96.00155
5 23  94.79646
5 24  90.47357
5 25  87.11472
end

regress y x
regress y x i.time

graph twoway scatter y x, msym(i) mlab(time)
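To see the within-time slopes directly rather than only in the plot, something like the following sketch should work on the example data above. It fits a separate regression for each value of time and displays the slope on x, which can then be compared with the pooled slope from -regress y x-:

Code:

* slope of y on x within each time period (time runs from 1 to 5 in the example data)
forvalues t = 1/5 {
    quietly regress y x if time == `t'
    display "time `t': slope = " %6.3f _b[x]
}

In the example data y falls as x rises within every period, so these slopes should be negative even though the pooled slope is positive.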


Ivan Ivanov

Join Date: Sep 2024

Posts: 11
#8

10 Sep 2025, 19:12

Thank you Clyde, I did the graph for my data, and although it is not as evident as your example, I can see a small negative relationship.
