-xtset panel time-: What does the time variable do?

paulvonhippel

Join Date: Apr 2014
Posts: 502

08 Nov 2021, 13:43

I don't understand what the time variable does in -xtset panel time-. I first thought that it incorporated time fixed effects into models using -xtreg, fe-, but it appears it does not. So what is it for? Here is an example:

Code:

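webuse nlswork, clear /* assumption: the variable names below match Stata's NLS example data; the post does not say which data set was loaded */
xtset, clear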
xtset idcode

/* Here's a model with fixed effects for individual and year */
xtreg ln_w tenure age i.year, fe cluster(idcode)
/* That's fine, but we have a coefficient for each year, which we don't really need to see. */
/* Is it possible to incorporate the year fixed effects implicitly, like this? */
xtset, clear
xtset idcode year
xtreg ln_w tenure age, fe cluster(idcode)
/* Nope, these estimates are different than the previous set. They must not include year fixed effects. */
/* Here's the model with year fixed effects again. */
xtreg ln_w tenure age i.year, fe cluster(idcode)

So what difference to the estimates did -xtset idcode year- make? Is it possible it would make a bigger difference if the rows were not ordered by year within idcode?

Thanks for any clarification!
Paul

Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

08 Nov 2021, 14:01

No, as you have noted, -xtset- with a time variable does not incorporate time fixed effects into the model. You have to do that explicitly with i.time in the varlist of the regression.

Inclusion of a time variable with -xtset- has no effect at all on the kind of regressions you show here. Where it makes a difference is:

1. If you do a regression that incorporates autoregressive structure into the residuals: it lets Stata know what variable to look at to figure out the right chronological sequence and intervals.

2. If you use time-series operators (-help tsvarlist-) such as lags, leads, differences, or seasonal differences, it lets Stata know what variable identifies chronological order and time intervals.

The other "side effect" benefit of -xtset- with a time variable is that it will verify that the panel id and time variable uniquely identify observations in the data set. Otherwise put, it verifies that you truly have panel data. Many data sets that people expect to be bona fide panel data are large data sets that contain errors such as duplicate entries, or, worse, conflicting entries for the same panel in the same time period. Such errors invalidate the calculations done with them (even if no time series operators or autoregressive structure is involved) but go undetected if you just -xtset panelvar- without the time variable. By contrast, if you -xtset panelvar timevar- Stata will check on this and alert you if your data is not what you think it is, so you can find and fix the problem before somebody else finds it for you and embarrasses you about your incorrect results! That's why I usually incorporate a time variable when I -xtset- panel data even if I don't "need" it.
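
As a rough illustration of the points above, here is a minimal sketch. It assumes Stata's shipped nlswork example data (loaded with -webuse-), which may not be the data used in #1; the variable lag_tenure is a made-up name for illustration.

```stata
* Assumption: nlswork stands in for the data discussed in this thread
webuse nlswork, clear

* Panel variable only: -xtreg, fe- runs, but time-series operators do not
xtset idcode
xtreg ln_wage tenure age, fe vce(cluster idcode)
capture noisily generate lag_tenure = L.tenure   // fails: no time variable set

* Panel and time variable: same -xtreg, fe- estimates, and L. now works
xtset idcode year
xtreg ln_wage tenure age, fe vce(cluster idcode)
generate lag_tenure = L.tenure

* A duplicated panel-time pair is caught when the data are -xtset- again
expand 2 in 1
capture noisily xtset idcode year                // errors out: repeated time values within panel
```

The two -xtreg, fe- calls should give identical output, while the lag can only be created after the time variable has been declared.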
paulvonhippel

Join Date: Apr 2014

Posts: 502
#3

08 Nov 2021, 15:56

So, for example, if you're clustering the standard errors, having a time variable will help Stata to correctly model the serial correlation in the residuals?
Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#4

08 Nov 2021, 17:23

No. You can cluster the standard errors without setting a time variable. Cluster-robust standard errors protect against heteroskedasticity and arbitrary intra-panel correlation of residuals; they do not model serial correlation specifically, so they never need to know the time order.
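
To make the contrast concrete, here is a small sketch, assuming Stata's grunfeld example data (not the data from this thread). The first model only clusters; the second fits an explicit AR(1) disturbance with -xtregar-, which is the kind of command that relies on the time variable.

```stata
* Assumption: grunfeld example data, used purely for illustration
webuse grunfeld, clear
xtset company year

* Cluster-robust standard errors: the time variable plays no role here
xtreg invest mvalue kstock, fe vce(cluster company)

* Explicit AR(1) disturbance: needs the time variable declared by -xtset-
xtregar invest mvalue kstock, fe
```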

Raymond Portocarero

Join Date: Nov 2021
Posts: 8

#5

10 Nov 2021, 05:45

Dear participants,

My dataset looks at the score changes (SC) of individual firms over the years, whereby each firm incurred the score change at some point between 2016 and 2020. Therefore, I have repeated time values in my sample data. I provide an example below.

Code:

* Example generated by -dataex-. To install: ssc install dataex

 Firm_    Firm             Year   SC Own  Skills   PP ED
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "            2018  3  5.28  4.73 15.18 11
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
1 "ADVTECH LTD "               .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD " 2017  5  18.7  12.1 16.83 11
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
2 "AFRICAN OXYGEN LTD ORD "    .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"    2016 -3     0 11.66  16.5  0
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
3 "ARCELORMITTAL SA LTD"       .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "  2017  2 17.49  9.02 15.29 11
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
4 "ARGENT INDUSTRIAL LTD "     .  .     .     .     .  .
5 "ASPEN PHARMACARE HLDGS."    .  .     .     .     .  .
5 "ASPEN PHARMACARE HLDGS."    .  .     .     .     .  .
5 "ASPEN PHARMACARE HLDGS."    .  .     .     .     .  .
5 "ASPEN PHARMACARE HLDGS."    .  .     .     .     .  .
end

I would like to regress Scorechange (SC) on Own Skills PP ED
Code:

regress SC Own Skills PP ED

=> No problem, regression results show up. However, as I have individual firms and firms incurring the Scorechange (SC) in different years, I should control for fixed effects with respect to firms and years.

Code:

regress SC Own Skills PP ED, fe vce (cluster, firm_id)

or
Code:

 regress SC Own Skills PP ED, fe vce (cluster, Year)

=> Stata tells me "must first specify panelvar; use xtset".

However,
Code:

xtset SC year

yields "repeating time values in the sample". How should I correctly declare my panel data so that I can run the regression with fixed effects?

Thank you!!

Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#6

10 Nov 2021, 14:14

Since you chose to post this tangentially related question on this thread, do read my response in #2 carefully. You will see there that for the kind of regression you want to do, you do not need to specify a time variable in your -xtset- command. You only need to -xtset _Firm- for your purposes. You should not use SC as the panel variable in your -xtset- command: it makes no sense to do that. Nor should you use Year as the variable for clustering your standard errors unless there are some unusual circumstances that justify doing that. Usually we are concerned about intra-firm correlation, not intra-year, for this purpose.

Your commands should be something like:

Code:

xtset _Firm // USE THE NUMERIC FIRM VARIABLE
xtreg SC Own Skills PP ED i.Year, fe vce(cluster _Firm)

Yes, you have repeating time values in the sample, though in your example data, the repeated values are actually missing values. But that's a problem in its own right. Within each firm's data there is only one observation that consists of any actual non-missing data. So all of your observations in the estimation will be singletons, which means there will be no usable data in a fixed-effects regression and you will get an error message telling you that the regression cannot be done. You must have multiple observations per firm in order to do a fixed-effects regression.
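
A quick way to see the problem before estimating is to count the complete observations per firm. This is only a sketch: it uses the variable names shown in #5 and the _Firm name used above, and n_complete is a hypothetical helper variable.

```stata
* Count, within each firm, the observations with no missing model variables
egen n_complete = total(!missing(SC, Own, Skills, PP, ED)), by(_Firm)

* Firms with n_complete of 0 or 1 offer no within-firm variation
tabulate n_complete
```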

Moreover, why do you have so many observations with no data? This usually is indicative of really bad data management creating the data set in the first place, making the data set unusable and untrustworthy.

Finally, you did some kind of editing to the -dataex- output before you posted it. As a result it was unusable. NEVER EDIT THE -dataex- OUTPUT. Always post it exactly as it comes in the Results window.
Fei Wang

Join Date: Oct 2021

Posts: 726
#7

10 Nov 2021, 20:06

A little thought on #5. A record appears only when a firm experienced a score change, and missing observations may represent constant scores (or no score changes). If this is the case, you may want to fill all missing SC with zeros, indicating zero score changes. If at the same time all the other variables can be filled, then you would be able to apply panel-specific approaches. For now, the data are essentially pooled cross-sectional data, and what you're able to do is no more than the following estimation.

Code:

regress SC Own Skills PP ED i.Year, vce(cluster firmid)

Raymond Portocarero

Join Date: Nov 2021
Posts: 8

#8

11 Nov 2021, 02:37

Dear sirs,

Thank you for your answers and recommendations. Let me provide some more context.

Each firm has a score observation for every year from 2009 until 2021, ranging from 1-8. I calculated yearly score changes. In 2015, a new law was introduced. I am looking into the score change at the first score received under the new law. As the t-test results on the sample showed, the score change from the old to the new law was significantly different from the score changes in the previous years. Scores are based on the 4 specified elements (Own, Skills, PP, ED). I listed the average values for these elements for each firm (equal weighting).

I am interested in finding out whether scoring high/low on these elements had an impact on the resulting score shock. Therefore, only the score change in the first year of the new law is listed. The others are not "missing" observations. I simply edited the file for regression purposes. Firms adopted the new law in different years (some already in 2015, others in 2019). As each firm has observations from 2009 until 2021 and some firms naturally adopted the new law in the same year (e.g., 2016), I have repeated time values in the sample.

Below an unedited (thank you) example of the dataset.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte fid str50 firm int year byte(score scorechange) float(Own SD PP ED)
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . 6  .    .     .     .     .
1 "ADVTECH LTD "            2018 9  3 5.28  4.73 15.18 14.96
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
1 "ADVTECH LTD "               . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . 3  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD " 2017 8  5 18.7  12.1 16.83 14.96
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
2 "AFRICAN OXYGEN LTD ORD "    . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . 6  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"    2016 3 -3    0 11.66  16.5     0
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
3 "ARCELORMITTAL SA LTD"       . .  .    .     .     .     .
end

When

Code:

 xtset fid

and subsequently

Code:

 regress scorechange Own SD PP ED, fe vce (cluster, fid)

The regression works. Much appreciated.

However, when coding

Code:

 regress scorechange Own SD PP ED i.year, fe vce (cluster, fid)

I do not receive an F-statistic in my output. More than interested in hearing further recommendations.

Thank you

Last edited by Raymond Portocarero; 11 Nov 2021, 03:01.

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#9

11 Nov 2021, 03:14

Raymond:
1) there's no need to -xtset- your data before -regress- (whereas it is mandatory before -xtreg- and many -xt- suite commands);
2) the option -vce(cluster, fid)- is not expected to work (because of the extra comma); it should be:

Code:

vce(cluster fid)

3) as far as the missing F-statistic is concerned, please see -help j_robustsingular-.
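
As a hedged sketch of what one can still do when the overall F-statistic is not reported: the coefficient table is displayed as usual, and a joint test on a smaller set of coefficients can be requested with -testparm-. The variable names are those of the -dataex- example in #8; whether such a test is reliable depends on how many clusters the data contain.

```stata
* Pooled regression with year indicators and firm-clustered standard errors
regress scorechange Own SD PP ED i.year, vce(cluster fid)

* Joint Wald test of the four score elements only
testparm Own SD PP ED
```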

Kind regards,
Carlo
(Stata 19.0)
Raymond Portocarero

Join Date: Nov 2021

Posts: 8
#10

15 Nov 2021, 06:10

Dear Mr. Lazzaro,

Thank you for your answer and recommendation. Given my dataset, which model do you think is the most adequate?

1) Simple linear regression

Code:

regress scorechange Own SD PP ED

=> R-squared = 0.22, adjusted R-squared = 0.17; the Ownership (Own) coefficient = -0.12, significant at the 1% level.

2) Regression with clustering standard errors at individual firm-level.

Code:

regress scorechange Own SD PP ED, vce(cluster fid)

=> R-squared = 0.1; the Ownership (Own) coefficient = -0.09, significant at the 5% level.

3) Regression with clustering standard errors at individual firm-level and fixed year effects.

Code:

regress scorechange Own SD PP ED i.year, vce(cluster fid)

=> R-squared 0.17, Ownership (Own) coefficient = -0.09, significant at the 5% level. The year effects are all significant at the 1% level and there does not appear to be a contrast.

Regards
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#11

15 Nov 2021, 06:29

Raymond:
I would go with 3) (with a bit of guesswork, I must admit, as you did not post the complete output tables of the three regression models, as the FAQ asks. Thanks).
That said, I would also test whether (or not) regression #3) is correctly specified (see -linktest-).
As an aside, please call me Carlo, like all on (and many more off) this forum do. Thanks.

Kind regards,
Carlo
(Stata 19.0)

Raymond Portocarero

Join Date: Nov 2021
Posts: 8

#12

15 Nov 2021, 07:44

Dear Carlo, noted! Thank you for the advice. I have added the 3 models below. Interested in hearing your thoughts. I will also dive deeper into linktest. Lastly, I believe the score change is systematic for all firms. However, firms that had previously invested in ownership equity experienced a smaller shock (but still a shock). Therefore, the model is not meant to estimate every determinant of the score change; it only tries to show that the ownership element explains part of the reduction in the observed shock.

Model 1: Simple linear regression

scorechange      Coef.   St.Err.   t-value   p-value   [95% Conf   Interval]   Sig
Own              -.084      .044     -1.92      .059       -.171        .003   *
SD                .011      .076      0.15      .884       -.141        .164
PP               -.042      .098     -0.43      .666       -.237        .152
ED                .032      .066      0.48      .633         -.1        .163
Constant         3.129     1.525      2.05      .044        .087       6.171   **

Mean dependent var           1.680          SD dependent var        2.035
R-squared / Adj R-squared    0.064 / 0.11   Number of obs           75
F-test                       1.202          Prob > F                0.318
Akaike crit. (AIC)           323.393        Bayesian crit. (BIC)    334.980
*** p<.01, ** p<.05, * p<.1

Model 2: Linear regression with standard errors clustered at the individual firm level

scorechange      Coef.   St.Err.   t-value   p-value   [95% Conf   Interval]   Sig
Own              -.092      .046     -2.03      .046       -.183       -.002   **
SD               -.001      .088     -0.01      .995       -.177        .176
PP               -.036      .101     -0.36      .721       -.239        .166
ED                .035      .087      0.41      .685       -.138        .208
Constant         3.321     1.582      2.10      .039        .167       6.475   **

Mean dependent var           1.767          SD dependent var        1.976
R-squared                    0.083          Number of obs           73
F-test                       2.043          Prob > F                0.097
Akaike crit. (AIC)           309.281        Bayesian crit. (BIC)    320.733
*** p<.01, ** p<.05, * p<.1

Model 3: Linear regression with standard errors clustered at the individual firm level and year fixed effects

scorechange      Coef.   St.Err.   t-value   p-value   [95% Conf   Interval]   Sig
Own              -.091      .043     -2.10       .04       -.177       -.004   **
SD               -.045      .095     -0.47      .637       -.235        .145
PP                .001      .115      0.01       .99       -.227         .23
ED                .046      .085      0.54      .592       -.124        .216
                               .         .         .           .           .
2016             3.036      .713      4.26         0       1.616       4.456   ***
2017             3.437      .576      5.97         0       2.289       4.585   ***
2018             2.187      .549      3.99         0       1.093        3.28   ***
2019             3.014      .879      3.43      .001       1.261       4.766   ***
Constant          .059     1.771      0.03      .974      -3.472        3.59

Mean dependent var           1.767          SD dependent var        1.976
R-squared                    0.172          Number of obs           73
F-test                       .              Prob > F                .
Akaike crit. (AIC)           307.786        Bayesian crit. (BIC)    326.110
*** p<.01, ** p<.05, * p<.1

Regards

Raymond

Last edited by Raymond Portocarero; 15 Nov 2021, 08:05.

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712

#13

15 Nov 2021, 09:56

Raymond:
thanks for posting. Please use CODE delimiters (see the FAQ) in your future posts whenever you share Stata code and output tables. Thanks.
That said, I'd still go with Model 3: linear regression with standard errors clustered at the individual firm level and year fixed effects.
Please note that -linktest- is really easy to invoke as a postestimation command, as you can see in the following toy-example:

Code:

. use "C:\Program Files\Stata16\ado\base\a\auto.dta"
(1978 Automobile Data)

. regress price mpg i.foreign

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     14.07
       Model |   180261702         2  90130850.8   Prob > F        =    0.0000
    Residual |   454803695        71  6405685.84   R-squared       =    0.2838
-------------+----------------------------------   Adj R-squared   =    0.2637
       Total |   635065396        73  8699525.97   Root MSE        =    2530.9

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -294.1955   55.69172    -5.28   0.000    -405.2417   -183.1494
             |
     foreign |
    Foreign  |   1767.292    700.158     2.52   0.014     371.2169    3163.368
       _cons |   11905.42   1158.634    10.28   0.000     9595.164    14215.67
------------------------------------------------------------------------------

. linktest

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     23.95
       Model |   255815137         2   127907569   Prob > F        =    0.0000
    Residual |   379250259        71  5341552.94   R-squared       =    0.4028
-------------+----------------------------------   Adj R-squared   =    0.3860
       Total |   635065396        73  8699525.97   Root MSE        =    2311.2

------------------------------------------------------------------------------
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        _hat |  -2.273584   .8872812    -2.56   0.013    -4.042772   -.5043957
      _hatsq |   .0002889   .0000768     3.76   0.000     .0001357    .0004421
       _cons |   8496.863   2510.528     3.38   0.001     3491.013    13502.71
------------------------------------------------------------------------------

.

Since the coefficient on the squared fitted values (_hatsq) is statistically significant, the regression is (as expected) misspecified; put differently, the predictors and/or interactions included are not enough to give a fair and true view of the data-generating process.
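
One possible follow-up, sketched here only as an illustration: add a candidate term to the toy model and run -linktest- again. The squared mpg term is an assumption about where the misspecification might come from, not a recommendation, and the test may or may not come out differently.

```stata
* Same toy data, loaded with -sysuse- instead of a full file path
sysuse auto, clear

* Add a squared term in mpg (assumed source of curvature) and re-check
regress price c.mpg##c.mpg i.foreign
linktest
```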

Kind regards,
Carlo
(Stata 19.0)
