New user of stata

Izzuddin Abdulllah

Join Date: Mar 2017

Posts: 18
#1

27 Mar 2017, 02:04

Hello everyone. You can call me Din. I'm new to Stata; until now I used EViews to work on time-series data. Now I want to try panel (longitudinal) data, and I have a few questions about this software.

FYI, I am working on the impact of capital flight on the economic growth of the ASEAN-5 countries: Malaysia, Thailand, Indonesia, Vietnam and the Philippines.
The dependent variable is GDP growth.
The independent variables are the inflation rate, FDI, external debt, REER and IRD.

I managed to choose between the fixed- and random-effects models: the Hausman test suggests the fixed-effects model. I then proceeded with the autocorrelation and heteroskedasticity tests. All of these ran fine and gave me results.

I want to ask how to test which of the ASEAN-5 members experiences the most severe capital flight problem. Which commands do I need to run, or should I do something with the data first?
Tags: ASEAN5, capital flight, growth, individual effects

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712

27 Mar 2017, 02:34

Din:
welcome to the list.
I'm not clear on what you're after but, that said, I would recommend using -fvvarlist- notation to create a categorical variable for -country-. However, I do not think that -xtreg, fe- will give you the answer you're asking for. Country is a time-invariant predictor, so the -fe- specification (unlike -re-) will get rid of it, as you can see from the following oversimplified toy example:

Code:

. use "http://www.stata-press.com/data/r14/nlswork.dta", clear
(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)

. xtreg ln_wage i.race year , fe
note: 2.race omitted because of collinearity
note: 3.race omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =     28,534
Group variable: idcode                          Number of groups  =      4,711

R-sq:                                           Obs per group:
     within  = 0.1022                                         min =          1
     between = 0.0804                                         avg =        6.1
     overall = 0.0709                                         max =         15

                                                F(1,23822)        =    2712.80
corr(u_i, Xb)  = 0.0300                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
      black  |          0  (omitted)
      other  |          0  (omitted)
             |
        year |   .0182116   .0003497    52.08   0.000     .0175262    .0188969
       _cons |   .2551579   .0273177     9.34   0.000     .2016134    .3087023
-------------+----------------------------------------------------------------
     sigma_u |  .40800642
     sigma_e |  .30347936
         rho |  .64380983   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4710, 23822) = 8.91                 Prob > F = 0.0000


. xtreg ln_wage i.race year , re

Random-effects GLS regression                   Number of obs     =     28,534
Group variable: idcode                          Number of groups  =      4,711

R-sq:                                           Obs per group:
     within  = 0.1022                                         min =          1
     between = 0.0979                                         avg =        6.1
     overall = 0.0892                                         max =         15

                                                Wald chi2(3)      =    3204.55
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        race |
      black  |  -.1276321    .012944    -9.86   0.000     -.153002   -.1022623
      other  |   .0914599   .0540106     1.69   0.090    -.0143989    .1973187
             |
        year |   .0186316   .0003348    55.65   0.000     .0179754    .0192878
       _cons |   .2392558   .0269937     8.86   0.000      .186349    .2921626
-------------+----------------------------------------------------------------
     sigma_u |  .36717019
     sigma_e |  .30347936
         rho |  .59411999   (fraction of variance due to u_i)
------------------------------------------------------------------------------

As a closing remark, to increase your chances of getting helpful replies, in future please post exactly what you typed and what Stata gave you back (as per the FAQ).

Kind regards,
Carlo
(Stata 19.0)


Izzuddin Abdulllah

Join Date: Mar 2017

Posts: 18
#3

27 Mar 2017, 03:09

Thanks, Carlo. If I'm not mistaken, -fvvarlist- is for creating a categorical variable, and I have already done that manually in the Data Editor by giving each country a code (111, 222, 333, 444, 555). Specifically, I want to see which of those 5 countries shows the strongest influence of capital flight on GDP. In the thesis linked below, the author tested capital flight for each region and found that one region had the most significant influence (page 26). In my case I want to identify the country; is there any way to do this?

https://www.google.com/url?sa=t&rct=...sEhK8XRpN216UA
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#4

27 Mar 2017, 03:49

Din:
you're right about -fvvarlist- capabilities. I would strongly recommend that you avoid creating categorical variables and interactions by hand, because that way you cannot benefit from the wonderful capabilities of -margins- and -marginsplot- (both of which rely on -fvvarlist- notation instead).
Unfortunately I can only guess the meaning of capital flight (something that in Italian is known as capital runaway). However, you should have -capital flight- as one of your predictors, to interact it with -countries-, again via -fvvarlist-.
So your code may look as follows:

Code:

xtreg GDP_growth i.countries##c.capital_flight <other_controls>, fe

As far as the regression outcomes reported on page 26 of the paper you quoted are concerned, I'm under the impression that the author performed a set of single-country regressions.
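
A minimal sketch of how the interaction approach could be explored, run on the -nlswork- toy dataset used earlier in the thread rather than on Din's data. Here -race- stands in for country and -tenure- for capital flight; both substitutions are assumptions made purely for illustration. Under -fe- the main effect of the time-invariant factor is omitted, but its interaction with a time-varying regressor is still estimated, and -lincom- recovers each group-specific slope:

```stata
* Toy illustration only: race plays the role of country, tenure of capital flight
webuse nlswork, clear

* Main effects of race drop out under -fe-; the race#c.tenure terms remain
xtreg ln_wage i.race##c.tenure year, fe

* Slope of tenure in the base category (race == 1)
lincom tenure

* Slope of tenure for race == 2: base slope plus the interaction coefficient
lincom tenure + 2.race#c.tenure

* Joint test of whether the tenure slope differs across categories
testparm i.race#c.tenure
```

With only five countries and a short time span, the country-specific slopes would be estimated from very few observations each, so any ranking of countries obtained this way should be read cautiously.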

Kind regards,
Carlo
(Stata 19.0)
Izzuddin Abdulllah

Join Date: Mar 2017

Posts: 18
#5

30 Mar 2017, 00:39

Thanks once again, Carlo.
Sorry for the time it took me to understand your suggestion; I have also looked at a few similar questions.
I took your suggestion to include the capital flight variable in the model, but I used it as the dependent variable, with GDP, inflation, external debt and REER as my explanatory variables. That made me rerun everything from the beginning (descriptives, POLS, fixed, random, Hausman, xtcsd pesaran abs, xttest3 and finally xtreg CF i.country gdp ....., fe).
My results show that: 1) the fixed-effects model should be used, according to the Hausman test; 2) xtcsd, pesaran abs gives p-value = 1.1210 (no serial correlation, right?); 3) xttest3 gives p-value = 0.00000 (heteroskedasticity, right?); 4) xtreg CF i.country inf....., fe gives the results below, with the countries omitted due to collinearity. I'm a little confused by getting no serial correlation but heteroskedasticity. I don't know whether my model is wrong or not; the same goes for the attached picture.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#6

30 Mar 2017, 01:05

Din:
you probably typed:

Code:

xtset country timevar

hence -country-, which is a time-invariant predictor, has no coefficient estimated under -fe-.

As far as heteroskedasticity is concerned, you can use robust or clustered standard errors (the two options do the same job under -xtreg-): they deal with both heteroskedasticity and autocorrelation.
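
A minimal sketch of that equivalence, run on the -nlswork- toy dataset (the dataset and the choice of regressors are assumptions for illustration, not Din's model). Under -xtreg, fe-, vce(robust) and vce(cluster panelvar) should report the same standard errors:

```stata
* Toy illustration on -nlswork-; the variable choice is an assumption
webuse nlswork, clear

* Robust standard errors
quietly xtreg ln_wage tenure year, fe vce(robust)
scalar se_robust = _se[tenure]

* Standard errors clustered on the panel identifier
quietly xtreg ln_wage tenure year, fe vce(cluster idcode)
scalar se_cluster = _se[tenure]

* The two standard errors should coincide
display "robust: " se_robust "   clustered: " se_cluster
```

Cluster-robust standard errors rely on the number of clusters being reasonably large, so with only five countries they should be interpreted with care.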

Kind regards,
Carlo
(Stata 19.0)
Izzuddin Abdulllah

Join Date: Mar 2017

Posts: 18
#7

30 Mar 2017, 03:24

Actually, I entered all the data for my variables by copying and pasting from Excel (including coding the countries as 111, 222, 333, ...). To declare the data as panel data, I went through Statistics > Longitudinal/panel data > Setup and utilities > Declare dataset to be panel data, rather than typing the "xtset" command.

When I typed ". xtreg cf gdp inf xd reer ird nfdi, fe vce(robust)", the results showed no F statistic and no probability value (both empty).
As for the command (xtreg CF i.countries##c.gdp <other_controls>, fe), how do I type it correctly, sir? What is "i.country##c" in my case? What I typed exactly was ". xtreg cf i.country gdp inf xd reer ird nfdi, fe vce(robust)"; results are displayed only for 222, 333, 444 and 555. Why is 111 not displayed? Sorry for such questions.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#8

30 Mar 2017, 03:36

Din:
most of your queries are covered in any decent panel data econometrics textbook.
That said:
- for the missing F-test with non-default standard errors, see http://www.stata.com/statalist/archi...msg00646.html
- for -i.countries##c.capital_flight- see -help fvvarlist-;
- for reference category in categorical variables see, again, -fvvarlist- and, more comprehensively, https://en.wikipedia.org/wiki/Dummy_...le_(statistics)
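
On the reference-category point: with factor-variable notation Stata treats one level of the categorical variable as the base (by default the lowest value, which in Din's coding would be 111) and reports the other levels relative to it, which is why only 222 to 555 appear. A minimal sketch on the -nlswork- toy data (an assumption; Din's own variables are not used) that displays the base level and then changes it with the ib#. operator:

```stata
* Toy illustration on -nlswork-; race stands in for the country variable
webuse nlswork, clear

* Default: the lowest level (race == 1) is the base; -baselevels- displays it
regress ln_wage i.race, baselevels

* Choose race == 3 as the base instead
regress ln_wage ib3.race, baselevels
```

Changing the base only changes which comparisons are reported; the fitted values are the same in both regressions.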

Last edited by sladmin; 05 May 2017, 11:25.

Kind regards,
Carlo
(Stata 19.0)
Izzuddin Abdulllah

Join Date: Mar 2017

Posts: 18
#9

12 Apr 2017, 04:00

Hi Carlo,
Actually, I did not really understand the other discussions on this matter, so I decided to change my variables as follows:
Countries: Indonesia, Malaysia, Philippines, Thailand, Vietnam
Years: 2005-2015
Dependent variable: GDP. Independent variables: CF, NFDI, INF, IR, ED, REER

For my new case, there are a few tests for which I need your confirmation and guidance.
1) I ran "egen countrynum = group(A)" to group the countries and give each one a numeric code.
2) I used the "xtset" command and checked each variable individually; all variables are BALANCED except REER.
3) I ran the descriptive statistics with "xtsum countrynum B gdp cf nfdi inf ir ed reer". This part is still OK, I think.
4) I ran POLS via Statistics > Linear models and related > Linear regression. The p-value is 0.0506, and two variables, CF and IR, are significant at 5%.
5) Then I ran FEM with "xtreg gdp cf nfdi inf ir ed reer, fe": p-value = 0.1945 and only IR is significant. I stored the results with "estimates store fe".
6) Then I ran REM with "xtreg gdp cf nfdi inf ir ed reer, re": p-value = 0.0329 and only CF and IR are significant. I stored the results with "estimates store re".
7) Then I ran the Hausman test, which favours REM: with Prob>chi2 = 0.3223 I cannot reject the null, so I choose REM.
8) Then I re-estimated REM before proceeding with the autocorrelation test.
9) To test for autocorrelation, I ran "xtcsd, pesaran abs": p-value = 0.0011, so I reject the null, meaning the model has autocorrelation.
10) Testing heteroskedasticity with "xttest3" is not valid, because my model favours REM rather than FEM.
11) So I decided to use the command "xtgls gdp- reer", and the result shows homoskedasticity and no autocorrelation, as I highlighted below. (Is that correct?)

. xtgls gdp- reer

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        homoskedastic
Correlation:   no autocorrelation

Estimated covariances      =         1          Number of obs     =         55
Estimated autocorrelations =         0          Number of groups  =          5
Estimated coefficients     =         7          Time periods      =         11
                                                Wald chi2(6)      =      15.73
Log likelihood             = -112.9977          Prob > chi2       =     0.0153

------------------------------------------------------------------------------
         gdp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          cf |   .0525629   .0210875     2.49   0.013     .0112322    .0938937
        nfdi |   .1737389   .1239499     1.40   0.161    -.0691985    .4166762
         inf |  -.1626513   .1093671    -1.49   0.137     -.377007    .0517043
          ir |  -.2794507   .1053193    -2.65   0.008    -.4858728   -.0730286
          ed |  -.0307354   .0261252    -1.18   0.239    -.0819399     .020469
        reer |   .0400895   .0316042     1.27   0.205    -.0218535    .1020325
       _cons |   3.430404   3.450507     0.99   0.320    -3.332465    10.19327
------------------------------------------------------------------------------

12) Then I ran ". xtreg gdp i.countrynum##c.cf nfdi inf ir ed reer, re"; the result is below, and I don't know how to interpret it:

. xtreg gdp i.countrynum##c.cf nfdi inf ir ed reer, re

Random-effects GLS regression                   Number of obs     =         55
Group variable: countrynum                      Number of groups  =          5

R-sq:                                           Obs per group:
     within  = 0.1829                                         min =         11
     between = 1.0000                                         avg =       11.0
     overall = 0.3451                                         max =         11

                                                Wald chi2(14)     =      21.08
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0996

---------------------------------------------------------------------------------
            gdp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
     countrynum |
              2 |   1.726526   4.999864     0.35   0.730    -8.073028    11.52608
              3 |   .3043713   2.427603     0.13   0.900    -4.453643    5.062386
              4 |   -.968474   3.347313    -0.29   0.772    -7.529088     5.59214
              5 |   .4947344   3.545636     0.14   0.889    -6.454585    7.444054
                |
             cf |   .0792054   .1562874     0.51   0.612    -.2271122     .385523
                |
countrynum#c.cf |
              2 |  -.0621073   .1305281    -0.48   0.634    -.3179376     .193723
              3 |  -.0315412   .1209067    -0.26   0.794    -.2685139    .2054316
              4 |  -.0144504   .1536045    -0.09   0.925    -.3155097    .2866089
              5 |  -.0493981   .1509178    -0.33   0.743    -.3451916    .2463954
                |
           nfdi |   .2943652   .2358259     1.25   0.212    -.1678452    .7565755
            inf |   -.210873   .1270259    -1.66   0.097    -.4598392    .0380933
             ir |  -.2408274   .1218172    -1.98   0.048    -.4795848     -.00207
             ed |  -.0689681   .1091757    -0.63   0.528    -.2829486    .1450124
           reer |    .038697   .0459514     0.84   0.400     -.051366    .1287601
          _cons |   4.631882   4.858378     0.95   0.340    -4.890363    14.15413
----------------+----------------------------------------------------------------
        sigma_u |          0
        sigma_e |  2.0317125
            rho |          0   (fraction of variance due to u_i)
---------------------------------------------------------------------------------

Actually, I'm not sure whether my model is correct, because when I tested POLS against REM with the xttest0 command, the results were as shown below:
. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects

        gdp[countrynum,t] = Xb + u[countrynum] + e[countrynum,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
                     gdp |   4.669162       2.160824
                       e |   3.804751       1.950577
                       u |          0              0

        Test:   Var(u) = 0
                             chibar2(01) =     0.00
                          Prob > chibar2 =   1.0000

Could anyone help me with this? I really have no idea.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#10

12 Apr 2017, 04:24

Din:
as per the outcomes that you provided, it seems that POLS can do a better job with your data than -xtreg, re-.
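
The reading behind this: in #9 the estimated sigma_u is 0 and -xttest0- returns chibar2(01) = 0.00 with Prob > chibar2 = 1.0000, so the null Var(u) = 0 is not rejected and the random-effects estimator collapses to pooled OLS. A minimal sketch of that sequence on the Grunfeld example data shipped with Stata (an assumption; it is not Din's dataset and may well give a different verdict):

```stata
* Illustrative data only: 10 companies observed over 20 years
webuse grunfeld, clear
xtset company year

* Random-effects model, followed by the Breusch-Pagan LM test of Var(u) = 0
xtreg invest mvalue kstock, re
xttest0

* Pooled OLS with standard errors clustered on the panel identifier
regress invest mvalue kstock, vce(cluster company)
```

If -xttest0- does not reject, the pooled regression in the last step is the natural fallback; the clustered standard errors are one possible choice, not the only one.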

Kind regards,
Carlo
(Stata 19.0)
Izzuddin Abdulllah

Join Date: Mar 2017

Posts: 18
#11

12 Apr 2017, 07:00

OK, thanks for the suggestion.

As far as I know, autocorrelation and heteroskedasticity can be corrected only under REM and FEM. In my case, since the results point to POLS, is there any way for me to tackle the autocorrelation and heteroskedasticity problems?

Also, I have 5 countries and 11 years, meaning that N<T, so I should use -xtgls- rather than -xtreg-. (Correct me if I'm wrong.)
Izzuddin Abdulllah

Join Date: Mar 2017

Posts: 18
#12

12 Apr 2017, 07:34

Post #6 of the thread at http://www.statalist.org/forums/foru...dom-panel-data says one should cluster on panelid.
So in my case, should I cluster on countrynum? How exactly do I do that?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#13

12 Apr 2017, 09:53

Din:
you're dealing with a small N, large T panel dataset.
I would take a look at -help xtpcse-.
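
A minimal sketch of what -xtpcse- calls can look like, using the Grunfeld example data shipped with Stata (the dataset and variable names are assumptions for illustration; Din's variables would replace them):

```stata
* Illustrative data only: 10 companies observed over 20 years
webuse grunfeld, clear
xtset company year

* OLS coefficients with panel-corrected standard errors
xtpcse invest mvalue kstock

* Prais-Winsten coefficients allowing a common AR(1) process in the errors
xtpcse invest mvalue kstock, correlation(ar1)
```

The first call leaves the coefficients at their OLS values and only changes the standard errors; the second also changes the coefficients because the AR(1) structure enters the estimation.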

Kind regards,
Carlo
(Stata 19.0)
Izzuddin Abdulllah

Join Date: Mar 2017

Posts: 18
#14

13 Apr 2017, 02:00

Thanks, Carlo.

AUTOCORRELATION
I think what I said in #9, point 9), is quite confusing. First I tested for correlation with "xtcsd, pesaran abs", and the result says to REJECT the null hypothesis of no serial correlation (p-value = 0.0011). But when I ran "xtserial depvar indepvars", the result says to ACCEPT the null hypothesis of no first-order autocorrelation (p-value = 0.6375 > 5%). Why do these two tests give different results? Is it because of my dataset?
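
For reference, the two commands do not test the same hypothesis, which may explain the conflicting verdicts: -xtcsd- (De Hoyos and Sarafidis) tests for cross-sectional dependence across panels after -xtreg, fe- or -xtreg, re-, whereas -xtserial- (Drukker) implements the Wooldridge test for first-order serial correlation within panels. A minimal sketch on the Grunfeld example data, assuming both community-contributed commands are already installed (the dataset and variable names are illustrative, not the ones in this thread):

```stata
* Illustrative data only: 10 companies observed over 20 years
webuse grunfeld, clear
xtset company year

* Both commands are community-contributed; -which- errors out if one is missing
which xtserial
which xtcsd

* Wooldridge test; H0: no first-order serial correlation within panels
xtserial invest mvalue kstock

* Pesaran CD test; H0: no cross-sectional dependence (run after -xtreg-)
quietly xtreg invest mvalue kstock, fe
xtcsd, pesaran abs
```

A rejection from -xtcsd- together with a non-rejection from -xtserial- is therefore not contradictory: the errors can be correlated across countries within a year without being correlated over time within a country.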

HETEROSCEDASTICITY
As far as I understand, "xtpcse" can correct for first-order autocorrelation through its -correlation(ar1)- option. So, if "xtserial" in the AUTOCORRELATION part shows no first-order autocorrelation, am I right not to bother with "xtpcse" (to correct the correlation) and instead to use "xtgls", because it gives me a homoskedastic result (as I mentioned in #9, point 11), highlighted in red)?

Please assist me.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#15

13 Apr 2017, 11:16

Din:
the justification for using -xtgls- instead of -xtpcse- rests on its greater asymptotic efficiency, provided that the model is correctly specified.
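
One point worth keeping in mind when reading -xtgls- output: the header lines "Panels: homoskedastic" and "Correlation: no autocorrelation" describe the error structure that the command was told to assume (the defaults are panels(iid) and corr(independent)); they are not the outcome of a test. A minimal sketch on the Grunfeld example data (an assumption; not the data in this thread) contrasting the default with a less restrictive specification:

```stata
* Illustrative data only: 10 companies observed over 20 years
webuse grunfeld, clear
xtset company year

* Default: homoskedastic panels and no autocorrelation are assumed, not tested
xtgls invest mvalue kstock

* FGLS allowing panel-level heteroskedasticity and a common AR(1) error process
xtgls invest mvalue kstock, panels(heteroskedastic) corr(ar1)
```

Comparing the two headers shows that the "Panels:" and "Correlation:" lines simply echo the options in force.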

I would take a look at the literature in your research field and see what others did in the past when presented with the same research topic.

Kind regards,
Carlo
(Stata 19.0)