Interpreting i.Country and ##

Armand Ndraxi

Join Date: Jun 2017

Posts: 18
#1

22 Jun 2017, 17:49

Hello,
I have a panel dataset that consists of countries and firms for each country. I want to test the effect of Firm Growth, Asset Tangibility, Profitability and Liquidity on the Leverage ratio.

My command is as follows.
xtreg LEVERAGE c.l1.LEVERAGE##i.GDP_tv c.GROWTH##i.GDP_tv c.LIQUIDITY##i.GDP_tv c.PROFITABILITY##i.GDP_tv c.ASSETTANGIBILITY##i.GDP_tv i.Country_new, re cluster(Firm_new)

However, I am unable to properly interpret Countries' significance (How can you interpret that one country has a significant value and another country does not have a significant value?)

Could you help me interpret the i.Country values, along with the ## values?

Thank you.
Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#2

22 Jun 2017, 18:12

However, I am unable to properly interpret Countries' significance (How can you interpret that one country has a significant value and another country does not have a significant value?)

Basically, you don't. The coefficients for the countries represent time-invariant country-specific effects on leverage, such as might result from the effects of different laws in different countries. Unless one of your research goals is to specifically contrast the leverage levels in one country with another country, this part of the output should be ignored. It just allows you to adjust for time-invariant differences among the countries.

I should also add that, in any context, not just this one, the difference between statistically significant and not statistically significant is NOT itself statistically significant, nor even meaningful, and should never be anything you waste a moment's thought on.

All of your interaction terms have a continuous variable interacted with some discrete variable GDP_tv. So what you are saying in this model is that there is no single effect of any of these continuous variables on leverage. Rather there are separate effects corresponding to each of the levels of the variable GDP_tv. So probably the statistics of most interest are the marginal effects of these continuous variables at each level of GDP_tv. While it is possible to calculate those from the regression output, the process is tedious and error-prone. Fortunately, you used factor-variable notation in your regression, so Stata will do it for you effortlessly with the -margins- command, which you should run just after the regression itself.

Code:

margins GDP_tv, dydx(L1.LEVERAGE GROWTH LIQUIDITY PROFITABILITY ASSETTANGIBILITY)

You might also be interested in the expected values of LEVERAGE at each level of GDP_tv at selected interesting values of these continuous variables. So, for example, if you want to know the expected value of LEVERAGE at each value of GDP_tv when LIQUIDITY is 0, 2, 5, or 10 (I made up these numbers--use numbers that are reasonable and interesting in your actual data) that would be:

Code:

margins GDP_tv, at(LIQUIDITY = (0 2 5 10))
marginsplot

The -marginsplot- command in the above will graph those for you, which is often the best way to understand what is going on in interaction models.

Anyway, the really interesting useful results from these models are found in the output of -margins- and graphs created by -marginsplot-. The regression output itself constitutes the "raw ingredients" from which Stata calculates the expected values and marginal effects, but they are generally only of secondary interest in their own right. The one context in which the regression output itself takes on extra importance is if one of your research goals is to determine whether, and by how much, the effects of one or more of these continuous variables actually differs according to the value of GDP_tv. In that case, the regression output, and specifically the rows for the interaction terms themselves, answer those questions.

Last edited by Clyde Schechter; 22 Jun 2017, 18:15.
Armand Ndraxi

Join Date: Jun 2017

Posts: 18
#3

23 Jun 2017, 02:55

Hello Clyde,

Your answer was beyond my expectations and I thank you for that.

Now, my case is to distinguish between developing and developed countries, and the variable GDP_tv is a variable I have created to use GDP of countries to group them in Developed and Developing.

Nonetheless, I wanted to also add the effect of the continuous variables (without the interaction term) on Leverage as well. However, when I include those in the regression, STATA omits them, and I cannot figure out why.

The command I write in that case is:
xtreg LEVERAGE l1.LEVERAGE c.l1.LEVERAGE##i.GDP_tv GROWTH c.GROWTH##i.GDP_tv LIQUIDITY c.LIQUIDITY##i.GDP_tv PROFITABILITY c.PROFITABILITY##i.GDP_tv ASSET TANGIBILITY c.ASSETTANGIBILITY##i.GDP_tv i.Country_new, re cluster(Firm_new)

My main focus is to see whether these variables have an effect on LEVERAGE and whether this effect is different among the Developing and Developed countries. I, however, seem to be missing something and cannot properly assess my results and the omitted variables without the interaction term.
Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#4

23 Jun 2017, 08:22

Nonetheless, I wanted to also add the effect of the continuous variables (without the interaction term) on Leverage as well. However, when I include those in the regression, STATA omits them, and I cannot figure out why.

The command I write in that case is:
xtreg LEVERAGE l1.LEVERAGE c.l1.LEVERAGE##i.GDP_tv GROWTH c.GROWTH##i.GDP_tv LIQUIDITY c.LIQUIDITY##i.GDP_tv PROFITABILITY c.PROFITABILITY##i.GDP_tv ASSET TANGIBILITY c.ASSETTANGIBILITY##i.GDP_tv i.Country_new, re cluster(Firm_new)

When you have A##B in a model, Stata automatically includes A and B as well. So when you have both LIQUIDITY and c.LIQUIDITY##i.GDP_tv, you are including LIQUIDITY twice. But any variable is always collinear with itself, so one of the two copies gets omitted by Stata. In this case Stata chose to omit the copies that you specified rather than the ones that it created for you. If you look at your output closely, I'm sure you will see that LIQUIDITY is listed twice. One time it appears with a coefficient, standard error, etc. The other time it says (omitted). But it's there once, and that's all you need, and all you can get.
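If it helps to see this outside your own data, here is a minimal sketch using the auto dataset that ships with Stata. It is only an illustration: weight and foreign are stand-ins for LIQUIDITY and GDP_tv, and -regress- is used instead of -xtreg- to keep it short. The first two commands should give identical results, and the third should report one copy of weight as omitted.

Code:

sysuse auto, clear

* The ## operator already brings in both main effects
regress price c.weight##i.foreign

* The same model with every term written out by hand
regress price c.weight i.foreign c.weight#i.foreign

* Main effect listed a second time: the redundant copy is collinear
regress price weight c.weight##i.foreign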

Since your main interest is in whether these variables have different effects in developed and developing countries, your main focus when reading the output should be on the coefficients of the interaction terms. So, assuming that your GDP_tv variable is coded 0 (developing), 1 (developed), the difference in the effect of liquidity on leverage between developed and developing countries is estimated by the coefficient of 1.GDP_tv#c.LIQUIDITY. It has a standard error, a confidence interval, and a p-value as well. These are the rows of the regression output that directly answer your focal question.

As an ancillary matter, it is usually enlightening to look at the actual effects of LIQUIDITY in each group of countries. That you get from margins:

Code:

margins GDP_tv, dydx(LIQUIDITY)

So just do this for each of LIQUIDITY, PROFITABILITY, and ASSETTANGIBILITY (and lagged leverage, if that's part of your focal question, too) and you have what you need.
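To see how the interaction coefficient and the -margins- output fit together, here is a small sketch on the auto dataset shipped with Stata. Again, weight and foreign are only stand-ins for LIQUIDITY and GDP_tv, and -regress- replaces the panel model for brevity. The estimate reported by -lincom- should equal the difference between the two slopes reported by -margins-.

Code:

sysuse auto, clear
regress price c.weight##i.foreign

* Slope of weight within each level of foreign
margins foreign, dydx(weight)

* The interaction coefficient: slope when foreign == 1 minus slope when foreign == 0
lincom 1.foreign#c.weight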
Armand Ndraxi

Join Date: Jun 2017

Posts: 18
#5

23 Jun 2017, 18:13

Dear Clyde,

Thank you once again for your helpful answer.

I tried running each of the -margins- commands in Stata, but it gives me this message:
"default prediction is a function of possibly stochastic quantities other than e(b)"

I have a Random Effects model and I tried to perform the test with other commands as well, but none of them seems to be working.

Could you please suggest another way for me to run the -margins- command, or anything similar, to retrieve the information I need regarding the specific effect of each variable, depending on the level of FDI?
Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#6

23 Jun 2017, 18:30

Yes, -margins- is somewhat restricted after -xtreg-. So use -mixed- instead. It estimates the same model, though it does so a bit differently and your results may differ slightly.

Code:

mixed LEVERAGE l1.LEVERAGE c.l1.LEVERAGE##i.GDP_tv GROWTH c.GROWTH##i.GDP_tv LIQUIDITY c.LIQUIDITY##i.GDP_tv PROFITABILITY c.PROFITABILITY##i.GDP_tv ASSETTANGIBILITY c.ASSETTANGIBILITY##i.GDP_tv i.Country_new || panel_var: , vce(cluster Firm_new)
margins GDP_tv, dydx(LIQUIDITY) at(LIQUIDITY = (interesting values of liquidity))

In the above, replace "panel_var" with the name of the panel variable you used when you -xtset- your data. That will give you an equivalent model. And, of course, in the -margins- command, also substitute actual values for the liquidity variable where it says "interesting values of liquidity".

Note: This approach will run in Stata version 15 and version 14.2. I seem to recall that in earlier versions of Stata, -margins- would give this same error message after -mixed- as after -xtreg, re-, but I'm not certain. Anyway, give it a try. But if you're running version 13 or earlier and it gives you the same error, then I don't know what else to advise you.
Sisi Ivanova

Join Date: Jun 2017

Posts: 10
#7

19 Jul 2017, 14:55

Hello all,

I am a very new user of Stata and I am struggling with a quite similar problem.

I have panel data for a number of industries in 9 different countries for the time period of 1995-2014.

What am I interested in?
1. The effect of ICT capital on employment for the whole period
   1.1) across industries in general
   1.2) across industries in each country
2. The effect of ICT capital on employment through the years
   2.1) across industries in general
   2.2) across industries in each country

I tried point 1 first because I assumed that it would be easier, but I am stuck. In general, I am not sure if the method I am using is correct. And there is definitely something wrong somewhere, as in the end I get the same estimate for each industry.

I would really appreciate your help.

This is what I have figured out so far:

Code:

encode ID, generate(nID)
encode Industry, generate(industry)
drop Industry
encode Country, generate(country)
drop Country
xtset nID year
xtreg lnEMPE lnLAB lnICT lnK_NON_ICT lnVA i.industry##i.country
margins, dydx(lnICT)
margins, dydx(lnICT) over(industry)
Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#8

19 Jul 2017, 15:05

Well, as an aside, the last command should be -margins industry, dydx(lnICT)-; the -over()- option does something different and it does not adjust for differences among industries on other variables. So probably you don't want the -over()- option.
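The distinction is easiest to see with predicted values. Here is a small sketch on the auto dataset shipped with Stata (not your data; foreign merely plays the role of industry and weight the role of the other covariates). The two -margins- commands should return different numbers, because only the first one holds the distribution of weight fixed across the groups.

Code:

sysuse auto, clear
regress price weight i.foreign

* Adjusted: every car is treated in turn as domestic, then as foreign,
* keeping its own observed weight
margins foreign

* Unadjusted: predictions are averaged within each observed group, so the
* difference in weight between the groups stays mixed in
margins, over(foreign)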

That said, the reason you are getting the same value for the marginal effect of lnICT in each industry is that in your -xtreg- command you don't have any interaction between lnICT and industry. So I think you really want your model to be:

Code:

xtreg lnEMPE lnLAB lnK_NON_ICT lnVA lnICT##i.industry##i.country

That is a model that specifies variation in the effect of lnICT by industry (and by country and by industry-country pair).
Sisi Ivanova

Join Date: Jun 2017

Posts: 10
#9

19 Jul 2017, 15:30

Thank you for the immediate reply!

I tried the new code, but I get an error message. I would assume that's because there are some industries which have no ICT data. Can this be the reason? Is there a smart way I can handle this?

Code:

xtreg lnEMPE lnLAB lnK_NON_ICT lnVA lnICT##i.industry##i.country
lnICT: factor variables may not contain noninteger values
r(452);
Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#10

19 Jul 2017, 16:36

No, that's not the problem. I forgot that lnICT is a continuous variable. So it must be:

Code:

xtreg lnEMPE lnLAB lnK_NON_ICT lnVA c.lnICT##i.industry##i.country

I'm sorry for the error.

The industries which have no ICT data will simply be omitted from the estimation sample, but they will not provoke any error messages.
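You can check which observations were dropped after any estimation command by using e(sample). A minimal sketch on the auto dataset shipped with Stata (an illustration only, with -regress- standing in for -xtreg-), where rep78 is missing for 5 of the 74 cars:

Code:

sysuse auto, clear

* Runs without any error; cars with missing rep78 are silently excluded
regress price weight i.rep78

* Number of observations left out of the estimation sample
count if !e(sample)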
Sisi Ivanova

Join Date: Jun 2017

Posts: 10
#11

21 Jul 2017, 05:33

Thank you very much, Clyde.

May I ask something else on a slightly different topic? I was just checking some of the previous discussions on this topic and I read a lot of comments from you, so I assume that you would know the answer to my question.

I am trying to figure out the long-term effect of ICT capital on employment across industries.

I was advised to do it by building a regression on the difference in the variables (both the dependent and independent ones) between the observations' latest year (2014) and the first year (1995). I thought that the long-term effects are normally obtained with lags, but it seems that I cannot use them in this case.

Based on my simple understanding of Stata, I thought I could just define the delta variables (for example, 2014 ICT capital - 1995 ICT capital for industry A in country AA, meaning per ID) and then build a regression using the delta variables. I know it shouldn't be that difficult, but I am really struggling with the code for the delta variables.
Sisi Ivanova

Join Date: Jun 2017

Posts: 10
#12

21 Jul 2017, 07:03

Ah, I think I just found the solution.

Code:

gen diff_ICT = s19.ICT
Clyde Schechter

Join Date: Apr 2014

Posts: 30187
#13

21 Jul 2017, 09:04

Originally posted by Sisi Ivanova

Ah, I think I just found the solution.

Code:

gen diff_ICT = s19.ICT

Well, s19.ICT will give you the difference between the 2014 value and the 1995 value of ICT. But for your 2013 observation, it will try to calculate the difference between 2013 and 1994. And, if I understood you correctly in #11, your data begins at 1995. That implies that all observations before 2014 will have missing values for diff_ICT. Is that what you want?
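Here is a toy illustration of that point, using a made-up single panel with invented values (id, year and ICT below are created just for the example and have nothing to do with your data):

Code:

clear
set obs 20
gen id = 1
* Years 1995 through 2014
gen year = 1994 + _n
* Invented values: 10 in 1995 rising to 200 in 2014
gen ICT = 10 * _n
xtset id year

gen diff_ICT = S19.ICT

* Only the 2014 row has a value (200 - 10 = 190)
list year ICT diff_ICT if !missing(diff_ICT)

* The other 19 years are missing
count if missing(diff_ICT)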

If what you need is, for every year, the difference between that year and the value in 1995, it would require creating a new variable:

Code:

by nID, sort: egen ICT1995 = max(cond(year == 1995, ICT, .))
gen diff_ICT = ICT - ICT1995

If what you need is, for every year, the difference between that year and the value in the first year of your data (which may or may not be 1995 depending on different values of nID):

Code:

by nID (year), sort: gen diff_ICT = ICT - ICT[1]

Finally, with regard to #11, I cannot advise you. How to best represent these effects is a substantive question in economics/finance and is far outside my domain. (I'm an epidemiologist.)