  • Diff-in-Diff three-way interaction term with fixed effects

    Dear all,

    I am conducting a DiD regression with firm and year-month fixed effects. My regression looks like this:

    Code:
    areg DepVar i.treated##i.during Controls i.month, absorb(permno) vce(cluster permno)
    I am interested in further examining the effect of the treatment by including a third interaction term. For instance, I would like to see if the treatment had a bigger effect for firms with high/low monthly turnover. I have thought about creating dummy variables in each month for firms with below/above median turnover (TreatedxDuringxHighturnover). I was told that it also works to put a continuous variable in the interaction term (TreatedxDuringxTurnover). Which method do you think works best?

    I need some help with constructing the regression equation with fixed effects, as well as with the interpretation of the interaction terms.

    Furthermore (correct me if I am wrong), if I wanted to extend the experiment to three periods, the regression should look something like this:
    Code:
    areg DepVar i.treated##i.during i.treated##i.post Controls i.month, absorb(permno) vce(cluster permno)
    Treated*during would be the change in the treatment group relative to the control group in periods pre-during, and treated*post would be the diff-in-diff estimator for the pre-post periods. Is this correct?

    Any advice would be greatly appreciated!
    Thanks!

  • #2
    Does anyone have a solution to this issue?

    I tried dropping the observations with HighTurnover=0, but I think that's not the correct thing to do in this type of setting, because it biases the DiD estimate. I think it only works with the three-way interaction term. However, which other interactions should I include in the regression if I also account for fixed effects? Furthermore, how do I interpret the interaction terms?



    • #3
      Any comments?
      Thanks!



      • #4
        I am interested in further examining the effect of the treatment by including a third interaction term. For instance, I would like to see if the treatment had a bigger effect for firms with high/low monthly turnover. I have thought about creating dummy variables in each month for firms with below/above median turnover (TreatedxDuringxHighturnover). I was told that it also works to put a continuous variable in the interaction term (TreatedxDuringxTurnover). Which method do you think works best?
        Making a categorical variable out of a continuous variable is almost always a bad idea. Let's say, for the sake of discussion, that turnover is continuous and ranges from 0 to 100. Say the median that you propose as the cutoff is 35. By using a dichotomy, you are saying that an entity with turnover = 36 is radically different from one with turnover = 34, but is indistinguishable from one with turnover = 100. Such conclusions are almost always nonsense. That said, there are very rare circumstances in the real world where something truly discontinuous happens as one passes a cutoff. I don't work in your field, so I can't tell you if this is one of those rare circumstances. All I can say is that they are very rare, and that it would be particularly surprising if the cutoff that corresponds to this real-world discontinuity turned out to be the median!
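        The point about the cutoff can be made concrete with a small, purely illustrative Python sketch (the turnover values 34, 36, 100 and the cutoff of 35 are the invented numbers from the paragraph above, not real data):

        ```python
        # Illustration of the information discarded by dichotomizing a continuous
        # variable at a cutoff. Values (34, 36, 100; cutoff 35) are the invented
        # numbers from the discussion above.
        turnovers = [34, 36, 100]
        cutoff = 35

        high_dummy = [1 if t > cutoff else 0 for t in turnovers]

        # After dichotomizing, 36 and 100 are indistinguishable, while 34 and 36
        # land in different categories despite being nearly identical.
        assert high_dummy == [0, 1, 1]
        assert abs(36 - 34) < abs(100 - 36)  # the continuous scale keeps this information
        ```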

        The downside of using a continuous variable, however, is that the relationship you are trying to model might not, in reality, be linear, so some kind of transformation might be necessary. Graphical exploration will usually be helpful in figuring out what transformations, if any, might be needed to linearize the relationship. This is still preferable to false dichotomies.

        areg DepVar i.treated##i.during i.treated##i.post Controls i.month, absorb(permno) vce(cluster permno)
        No, that's wrong. What you need to do is create a three level variable. Let's call it era and code it 0 for before the treatment, 1 for during, and 2 for after. Then the code should be:

        Code:
        areg DepVar i.treated##i.era Controls i.month, absorb(permno) vce(cluster permno)
        This code will give the same -areg- results as what you proposed. But the reason I say your code is wrong is that you are likely to end up turning to the -margins- command to interpret your results. And -margins- will get it wrong with your code because -margins- will not know that post and during are part of the same three-level variable.

        I suggest you read up on the -margins- command. It is nearly impossible to interpret models with three-way interactions without it, and it is really handy even with just two-way interactions. The best introduction to -margins- I know of is the excellent Richard Williams' https://www3.nd.edu/~rwilliam/stats/Margins01.pdf. At the same site he has additional presentations on Margins, and there is also the -margins- chapter of the PDF documentation that comes with your Stata installation. After doing that, try to write the -margins- command(s) for interpreting your results. If you're not sure you're getting it right, you can always post back showing what you've done, along with the output, for help or confirmation.



        • #5
          Thank you, Clyde!

          I understand the potential limitations of both methods. But how would the regression equation look if I add a third term to the interaction: let's say HighTurnover, a dummy that equals 1 if a stock's turnover in a month is higher than the median turnover across all stocks in that month? I am guessing something like this:

          Code:
          bysort month: egen median=median(turnover)
          
          gen HighTurnover = 0
          replace HighTurnover = 1 if turnover > median & !missing(turnover)  // guard: Stata treats missing as larger than any number

          Code:
          clogit DepVar i.treated##i.during##i.HighTurnover Controls i.month, group(permno) vce(cluster permno)
          (I do a conditional logit analysis with fixed effects where the DepVar is a dummy variable)

          In this case I get 7 terms in total: treated 1 ; during 1 ; HighTurnover 1 ; treated#during 11; treated#HighTurnover 11; during#HighTurnover 11 ; during#treated#HighTurnover 111. Again, during and treated should be collinear with the fixed effects and thus omitted. Could you explain how to interpret these coefficients?

          Furthermore, I am a bit confused: in this case (with the three-way interaction) treated#during is significant, but when I run the normal DiD (with just i.treated#i.during) the same coefficient is statistically insignificant.



          • #6
            In this case I get 7 terms in total: treated 1 ; during 1 ; HighTurnover 1 ; treated#during 11; treated#HighTurnover 11; during#HighTurnover 11 ; during#treated#HighTurnover 111. Again, during and treated should be collinear with the fixed effects and thus omitted. Could you explain how to interpret these coefficients?
            I might if you actually showed the results. Then again, interpreting three way interactions from regression coefficients is a really tedious process and involves a lot of error-prone algebra. I'd be more inclined to do this with the output of suitable -margins- commands such as -margins during#HighTurnover, dydx(treated)-. (If you got "not estimable" results, add the -noestimcheck- option.)

            Furthermore, I am a bit confused, because in this case (when I do the three-way interaction) I get significance of treated#during, but when I do the normal DiD ( with just i.treated#i.during) the same coefficient is statistically insignificant.
            The meaning of the treated#during term in the i.treated##i.during model is different from the meaning of the treated#during term in the i.treated#i.during#i.highturnover model. There is no reason to expect them to be the same, or even similar, or even of the same sign. Moreover, if you think about these interaction models in terms of what is "statistically significant," I think you will never escape the world of confusion. Just think about the magnitudes of effects, and their confidence intervals, for a sense of the precision with which those effects are estimated. But if you use some arbitrary p-value cutoff (even the historically "venerated" 0.05) to declare that effects "exist" or "don't exist" or "are significant" or "are not significant," a world of apparent paradoxes awaits you and you can spend the rest of your life bewildered by them. My best advice in these models is to not even look at the p-values.
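            To see concretely why the coefficient answers a different question in each model, here is a purely illustrative Python sketch (every coefficient value is invented) that builds exact cell means from a saturated 2x2x2 specification:

            ```python
            # Invented coefficients for a saturated treated x during x HighTurnover model.
            b0, bT, bD, bH = 0.10, 0.02, 0.03, 0.05
            bTD, bTH, bDH, bTDH = -0.01, 0.02, 0.01, 0.04

            def cell_mean(T, D, H):
                """Expected outcome for one treated/during/HighTurnover cell."""
                return (b0 + bT*T + bD*D + bH*H
                        + bTD*T*D + bTH*T*H + bDH*D*H + bTDH*T*D*H)

            def did(H):
                """Difference-in-differences within one HighTurnover stratum."""
                return (cell_mean(1, 1, H) - cell_mean(1, 0, H)) - \
                       (cell_mean(0, 1, H) - cell_mean(0, 0, H))

            # In the three-way model, treated#during is the DiD when HighTurnover = 0 ...
            assert round(did(0), 9) == round(bTD, 9)
            # ... while for HighTurnover = 1 it is shifted by the triple interaction.
            assert round(did(1), 9) == round(bTD + bTDH, 9)
            # A model omitting HighTurnover pools the strata; in a perfectly balanced
            # 50/50 design its treated#during is an average of the two -- a different number.
            pooled = 0.5 * did(0) + 0.5 * did(1)
            assert round(pooled, 9) == round(bTD + 0.5 * bTDH, 9)
            ```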



            • #7
              Dear Clyde,

              Let's leave aside the three-way interaction term for now.

              I have a question on using a logit regression in a DiD setting. Here (https://stats.stackexchange.com/ques...google_rich_qa) it is stated that using DiD estimators in a non-linear (logit) setting results in the common trend assumption being violated.

              Furthermore, the coefficient for treatedxduring requires a non-linear transformation in order to make any sense. Do you suggest I stick to the logit model, or to switch to a linear probability model with a categorical independent variable?

              Code:
              clogit DepVar treated##during Controls i.month, group(permno) vce(cluster permno)  (1)
              Code:
              areg DepVar treated##during Controls i.month, absorb(permno) vce(cluster permno)  (2)
              Also, I get the coefficient of treatedxduring using (2) equal to -0.014. It tells the expected average change over time in the outcome variable for the treatment group, compared to the average change over time for the control group. But how do I interpret this coefficient in terms of economic magnitude (percentage terms)? The mean of Y for the whole sample in both periods is 0.30. Therefore, the percentage change should be equal to -0.014/0.3 = -4.7%? Is it correct to scale it by the mean for the whole sample (both treated and control groups) in both periods? Furthermore, 4.7% seems like a considerable decrease. However, the coefficient of treatedxduring is not statistically significant. What could be the explanation?

              Thank you very much for your time and help!



              • #8
                Here (https://stats.stackexchange.com/ques...google_rich_qa) it is stated that using DiD estimators in a non-linear (logit) setting results in the common trend assumption being violated.
                I haven't looked at that link, so I am not directly responding to what it says there. What is true is that the common trend assumption, strictly understood, cannot be true in both the linear and logit models. But there is nothing that prevents it from being true in the logit model setting (and, therefore, false in the linear model setting). That said, in the real world, the common trend assumption is rarely strictly true in either setting! If you plan to formally test the common trend assumption (for either model), your result will probably depend more on your sample size and the corresponding power to reject the null hypothesis than on reality. The common trend assumption, strictly speaking, is almost always false in the real world, but close enough to true that for practical purposes DID estimates can be useful. I think that graphical exploration of the common trend assumption is the most reasonable way forward. In statistical practice there are many situations where we rely on approximations that are no less crude than this.

                Furthermore, the coefficient for treatedxduring requires a non-linear transformation in order to make any sense.
                This is actually very controversial and has been discussed elsewhere in this Forum. While my own position is to prefer interpreting the results in the probability metric (i.e. inverse logit transformed from the coefficients), others have argued from different perspectives that the coefficient metric is in fact the more suitable one for causal inference. I have also argued elsewhere on this forum that while there is, in theory, a difference, in practice the two approaches usually result in very similar answers. I would even go so far as to say that if these two analyses led to opposing conclusions, then the data are insufficiently informative and the correct answer is "too close to call."

                Do you suggest I stick to the logit model, or to switch to a linear probability model with a categorical independent variable?
                Either model might be better than the other, depending on your situation. If there is nothing in the literature of your subject matter, and no quantitative theory to guide you, I would try both ways and see which one fits the data better. (I would not see which one gives you results supporting your preferred conclusion, or which one gives "significant" results.)

                Also, I get the coefficient of treatedxduring using (2) equal to -0.014. It tells the expected average change over time in the outcome variable for the treatment group, compared to the average change over time for the control group. But how do I interpret this coefficient in terms of economic magnitude (percentage terms)? The mean of Y for the whole sample in both periods is 0.30. Therefore, the percentage change should be equal to -0.014/0.3 = -4.7%? Is it correct to scale it by the mean for the whole sample (both treated and control groups) in both periods? Furthermore, 4.7% seems like a considerable decrease.
                Those calculations are, at best, a "back of the envelope" approximation. You need to use the -margins- command to calculate these properly. When you refer to "percentage terms," I imagine you mean that you want to calculate a semi-elasticity. So look at the -eydx()- option of -margins-.

                However, the coefficient of treatedxduring is not statistically significant. What could be the explanation?
                The explanation is the same as the explanation for any finding that is not statistically significant. It means that relative to the size of the effect you are trying to estimate, the data are too imprecise to determine whether it is positive, negative or zero. This may be due to noisy measures, insufficient sample size, or the actual effect being very small (or a combination of these). Be aware that even large samples can have limited power to detect interaction effects. See http://andrewgelman.com/2018/03/15/n...e-main-effect/ for a nice explanation and demonstration.



                • #9
                  Dear Clyde,

                  Thank you very much for your answer!

                  As you can tell, I am not very proficient with Stata and I am struggling to understand the -margins- command. From my understanding I have to first perform a regression. Let's say:

                  Code:
                  areg DepVar treatedxduring Controls i.month, absorb(permno) vce(cluster permno)
                  Afterwards I type:

                  Code:
                  margins, eydx(treatedxduring)
                  What exactly does ey/dx tell in this case?

                  One last thing, I know I have asked the same question before, but I am still uncertain whether to use firm and month fixed effects in my DiD regression. As you wrote in another post a while back:

                  Originally posted by Clyde Schechter View Post
                  When you include a firm ID effect in the model, you eliminate any confounding that might be caused by effects (observed or unobserved) that are constant over time within each firm.

                  When you include a year effect, you eliminate any confounding that might be caused by effects (observed or unobserved) that are constant across all firms within each year.

                  If you include these both, you eliminate entirely both the treatment group effect (which is constant within firms over time) and the pre-post effect (which is constant across firms within years, at least in your design). Both the TREAT and POST variables will be dropped. So your model can no longer estimate the impact of the intervention when you do this: it is a ghost of a difference-in-differences model and will provide you with no information about the intervention's impact.
                  I tried several options:

                  1. Without fixed effects:

                  Code:
                  reg DepVar i.treated##i.during Controls, cluster(permno)   (1)
                  This gives the same results as the -diff- command in Stata and is the standard DiD regression.

                  Code:
                  diff DepVar, t(treated) p(during) cov(Controls) cluster(permno)   (2)
                  2. When I include firm and time fixed effects:

                  Code:
                  areg Depvar i.treated##i.during Controls i.month, absorb(permno) vce(cluster permno)   (3)
                  Code:
                  areg Depvar treatedxduring Controls i.month, absorb(permno) vce(cluster permno) (4)
                  where treatedxduring is the interaction term

                  Regressions (3) and (4) yield the same results. Pilot is collinear with the firm fixed effects, and since the treatment happens at one point in time for all of the subjects in the group, during is collinear with the month fixed effects. In (3), however, Stata shows the during coefficient at the expense of the last i.month, which is omitted. Either way, whether I use the factor notation or the interaction term, Stata yields the same results. I am referring to your quote above and asking which of the models is the correct one. If I include both fixed effects, does the interaction term still hold the qualities of a correct DiD estimator?
                  Last edited by Fanetti Mazakura; 26 Apr 2018, 11:12.



                  • #10
                    Code:
                    areg DepVar treatedxduring Controls i.month, absorb(permno) vce(cluster permno)
                    Afterwards I type:

                    Code:
                    margins, eydx(treatedxduring)
                    What exactly does ey/dx tell in this case?
                    I see you have reverted to using your treatedxduring variable. In this particular instance, because you have both absorbed permno and included i.month in the model, the results are meaningful and interpretable. But in the more general situation, they would not be. So I think you should really get out of the habit of doing this. You got it right this time, but you are likely to get it wrong most of the time. Use factor-variable notation and run this as:
                    Code:
                    areg DepVar i.treated##i.during Controls i.month, absorb(permno) vce(cluster permno)
                    and you will then get the same results you got this time (because, in this particular instance your original model happens to be correct). Then follow that with:
                    Code:
                    margins, eydx(treated) at(during = 1) noestimcheck
                    and again, this will give you the same results as the -margins- command you wrote.

                    (Ok, you don't have to actually re-run it, but I want to make the point that what you did worked out only because of restrictive circumstances that you will not encounter in general, and you should get in the habit of doing it the way that will always work. It is especially important because it is, in general, completely invalid to try to calculate a marginal effect, or an elasticity or semi-elasticity, of an interaction term. When you use -margins- and factor-variable notation, Stata will pick that up if you try to break that rule and will refuse to go along.)

                    So what does it mean? Let's say that the result you got from that -margins- command is 0.07, for the purposes of illustration. It means that a unit increase in treated when during = 1 (i.e. the actual onset of treatment in the treatment group) is associated with a 0.07 increase in log(DepVar). That is equivalent to saying that DepVar increases by a factor of exp(.07), which is approximately 1.07 (actual value to 4 places is 1.0725). So the onset of treatment in the treatment group is associated with an increase in DepVar of approximately 7%. (7.25% would be closer).
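                    The arithmetic in the illustration above can be checked directly (the 0.07 is the hypothetical value from the text, not an actual estimate):

                    ```python
                    import math

                    # Check of the worked example: a semi-elasticity of 0.07 corresponds
                    # to DepVar changing by a factor of exp(0.07).
                    factor = math.exp(0.07)

                    assert round(factor, 4) == 1.0725            # the 4-decimal value quoted above
                    assert round((factor - 1) * 100, 2) == 7.25  # i.e. roughly a 7.25% increase
                    ```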

                    As for what I wrote elsewhere:
                    When you include a firm ID effect in the model, you eliminate any confounding that might be caused by effects (observed or unobserved) that are constant over time within each firm.

                    When you include a year effect, you eliminate any confounding that might be caused by effects (observed or unobserved) that are constant across all firms within each year.

                    If you include these both, you eliminate entirely both the treatment group effect (which is constant within firms over time) and the pre-post effect (which is constant across firms within years, at least in your design). Both the TREAT and POST variables will be dropped. So your model can no longer estimate the impact of the intervention when you do this: it is a ghost of a difference-in-differences model and will provide you with no information about the intervention's impact.
                    That is, as written, wrong. I recall what I was thinking when I wrote it, but it is not correct. I had in mind a different context, in which it was contemplated to introduce fixed effects for combinations of firm and time--that would make the model completely uninformative. But just introducing firm effects and time effects (not their combinations) does not produce this problem. So please disregard that earlier quote; it is wrong and I'm really sorry I wrote it.

                    Regressions (3) and (4) yield the same results. Pilot is collinear with the firm fixed effects, and since the treatment happens at one point in time for all of the subjects in the group, during is collinear with the month fixed effects. In (3), however Stata shows the during coefficient at the expense of the last i.month, which is omitted. However, whether I use the factor notation or the interaction term, Stata yields the same results. I am referring to your quote above and asking which of the models is the correct one. If I include both fixed effects does the interaction term still hold the qualities of a correct DiD estimator?
                    Yes, the two regressions give the same result. They are both correct. But (4) is a bad coding practice and it works correctly only when both firm and time effects are absorbed (or included through i.time). In any model that did not incorporate both the firm and time effects, (4) and (3) would be different and only (3) would be correct. So get in the habit of doing it with factor-variable notation. It will never lead you astray.



                    • #11
                      Dear Clyde,

                      Thank you very much for your valuable inputs! I am immensely grateful! The DiD model with fixed effects is very clear to me now. I hope it helps others struggling with it as well.

                      With respect to using the -margins- command to find the percentage change, I am not quite sure whether you are talking about the OLS or the logit model.
                      My results for the OLS model:
                      Code:
                      areg DepVar treatedxduring Controls i.month , absorb(permno) vce(cluster permno)
                      I get a coefficient of treatedxduring = -0.012 with a p-value of 0.3 (insignificant). I understand that this is the absolute change in Y for the treated group relative to the control group from t=0 to t=1. The mean value of DepVar for both groups and both periods is 0.17.

                      Code:
                      margins, eydx(treatedxduring)
                      This gives ey/dx for treatedxduring = -0.15. This means that the DepVar in the treatment group decreased by approximately 15% relative to the control group over the period, no? This effect seems economically significant. The statistical insignificance could be due to the specifics of my dataset (it is unbalanced, with large gaps between observations), or to the treatment having no effect on that particular variable. Should I even bother explaining the economic magnitude if the p-value is quite high?

                      My results for the clogit model:
                      Code:
                      clogit DepVar treatedxduring Controls i.month , group(permno) vce(cluster permno)
                      I get a coefficient of pilotxduring of -0.086. I understand that it has to undergo a non-linear transformation, but I am not sure what kind. I should probably stick to the OLS method, since it is easier to interpret.

                      Furthermore, I will accompany my regression results with a graphical illustration. However, the methodology for plotting the graph is much simpler than the DiD with fixed effects:

                      Code:
                      egen mean0 = mean(cond(treated == 0, sold, .)), by(month)
                      egen mean1 = mean(cond(treated == 1, sold, .)), by(month)  
                      gen diff = mean0 - mean1
                      sort month
                      line mean* diff month , legend(order(2 "treated" 1 "untreated" 3 "difference"))
                      I just take the mean for the treated group, the mean for the control group, and their difference and plot them. I don't know how accurate this approach is.
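                      The same per-month means and difference line can be sketched outside Stata as a sanity check; here is a stdlib-only Python analogue (the data rows are invented for illustration):

                      ```python
                      # Stdlib-only analogue of the Stata snippet above: per-month group means
                      # and their difference (mean0 - mean1). The (month, treated, outcome)
                      # rows are invented for illustration.
                      from collections import defaultdict

                      rows = [
                          (1, 0, 0.20), (1, 0, 0.30), (1, 1, 0.40),
                          (2, 0, 0.25), (2, 1, 0.15), (2, 1, 0.35),
                      ]

                      totals = defaultdict(lambda: [0.0, 0])  # (month, treated) -> [sum, count]
                      for month, treated, y in rows:
                          totals[(month, treated)][0] += y
                          totals[(month, treated)][1] += 1

                      def group_mean(month, treated):
                          s, n = totals[(month, treated)]
                          return s / n

                      # One difference point per month, mirroring `gen diff = mean0 - mean1`.
                      diff = {m: group_mean(m, 0) - group_mean(m, 1) for m in (1, 2)}

                      assert round(diff[1], 10) == -0.15  # month 1: 0.25 - 0.40
                      assert round(diff[2], 10) == 0.0    # month 2: 0.25 - 0.25
                      ```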

                      p.s.
                      I understand your advice with respect to the factor-variable notation. It is duly noted.
                      Last edited by Fanetti Mazakura; 26 Apr 2018, 13:55.



                      • #12
                        So, let's take a dive into some of these numbers.

                        In the linear model, your coefficient is -0.012, which means that treatment is associated with a decrease of 0.012 in your outcome variable. You say that the mean value of the outcome variable itself is 0.17, so if you just divide you get a decrease of about 7% in the outcome variable associated with treatment. Notice that what you have calculated here is the ratio of the average decrease in outcome to the average outcome. But the outcomes vary across observations. Some are larger and some are smaller. And the average decrease in outcome divided by the average outcome is not the same thing as the average of (decrease divided by outcome) over the observations--and the latter is what -margins, eydx()- is calculating. In particular, if some of your outcomes are near zero, those observations will have enormous values of decrease/outcome, and they will skew the result rather high. If you have any observations with DepVar = 0, then it is not even valid to calculate -margins, eydx()- here.
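                        The distinction between the two averages is easy to miss, so here is a purely numeric Python illustration (the outcome levels and the uniform decrease of 0.012 are invented):

                        ```python
                        # Why the average decrease divided by the average outcome differs from
                        # the average of (decrease / outcome), which is what -margins, eydx()-
                        # estimates. Outcomes near zero dominate the latter. Numbers invented.
                        outcomes  = [0.02, 0.30, 0.58]     # note one outcome near zero
                        decreases = [0.012, 0.012, 0.012]  # a uniform absolute decrease

                        ratio_of_averages = (sum(decreases) / len(decreases)) / (sum(outcomes) / len(outcomes))
                        average_of_ratios = sum(d / y for d, y in zip(decreases, outcomes)) / len(outcomes)

                        assert round(ratio_of_averages, 3) == 0.04  # decrease is ~4% of the mean outcome
                        assert average_of_ratios > 0.2              # the near-zero outcome inflates this badly
                        ```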

                        The fact that you go on to do a logistic regression tells me that in fact your DepVar is probably dichotomous--if you said that earlier in this thread, I have long lost sight of it. If DepVar is dichotomous, then you cannot calculate elasticities or semi-elasticities. Perhaps I led you down the wrong path by suggesting that--I thought it was what you meant and that DepVar was a continuous variable, but perhaps it isn't. You can, if you wish, calculate the percent of positive outcomes with and without treatment (pre and post), and you can express the treatment effect as the difference in differences, which is in units of percentage points. But I think that trying to represent that difference as a percentage of the baseline is not a good idea; whatever statistic you come up with is likely to be misunderstood.

                        So using your -areg- model, if DepVar is a dichotomy, I would simply say that the estimated effect of treatment is to decrease the outcome probability by 1.2 percentage points. The -margins- output will also show a confidence interval around that estimate, and you should present that as well, as a measure of uncertainty. The -margins, eydx()- results are meaningless if the outcome is dichotomous (and of dubious value if DepVar is continuous but has values less than or equal to 0, or even close to 0). As for the lack of statistical significance, the aspects of your data that you point out may well contribute to that, as can other things. I generally find the concept of statistical significance not useful in looking at these models. The null hypothesis of 0 effect is generally a completely implausible straw man, and I do not understand why people think it is useful to test its compatibility with the data. I think the focus should be on estimating the size of the effect, and being aware of the precision of our estimate. In this case we have an estimate that is pretty small, and not very precise. So the effect itself may be small, or the data are inadequate to the task, or, very possibly, both.

                        In your -clogit- model, you are estimating the treatment effect as an odds ratio. With the coefficient being -0.086, the corresponding odds ratio is exp(-0.086) = 0.92 to two decimal places. If a positive outcome is relatively rare, then the odds ratio is almost equal to the risk ratio, and one could speak of this as saying that the probability of an event is reduced by 8 percent of its original value. If positive outcomes are not rare, then this approximation works poorly and you should not attempt this kind of interpretation. Also, with the -clogit- model you cannot estimate the actual probability of a positive outcome under either the treated or untreated condition, so interpretation gets even harder.
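                        The odds-ratio arithmetic can be verified directly (using the -0.086 coefficient reported above):

                        ```python
                        import math

                        # A logit coefficient of -0.086 implies an odds ratio of exp(-0.086).
                        coef = -0.086
                        odds_ratio = math.exp(coef)

                        assert round(odds_ratio, 2) == 0.92  # the two-decimal value quoted above

                        # When positive outcomes are rare, the odds ratio approximates the risk
                        # ratio, so 1 - exp(coef) is the approximate relative reduction in probability.
                        assert round(1 - odds_ratio, 2) == 0.08  # roughly an 8% relative decrease
                        ```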

                        Your approach to graphing looks reasonable to me, and will probably be far more understandable than any of the numbers we are talking about here. Of course, what you are plotting is not coming from your model, it is tracking the actual data and is not adjusted for the covariates in your model.

                        Added note: if the difference is small (as the numbers we are discussing suggest it is) the diff line may be squashed on the bottom of the graph. So you might want to plot the difference line in a separate panel so that its extent of variation over time will be more visible.



                        • #13
                          Thank you very much, Clyde!

                          I will shortly explain my empirical design in order to clarify the confusion.

                          I am using a natural experiment to test its impact on stock trading by a group of insiders at the firms. Since they don't trade every month, I have assigned a dummy variable, trade_dummy, that equals 1 if even a single insider trades the company stock in a given month. From my understanding I could go with either a logit or a linear regression in this case. If I do a linear regression, the interaction term tells the absolute change in trade_dummy. If I go with logit, I would have to transform the coefficient with 1-exp(beta), and it would tell the same thing as in the linear model, that is, the treatment group experienced an 8% decline in the probability of conducting a trade relative to the control group.

                          Furthermore, since a different number of insiders could trade in a month (from 0 to 20), the dummy variable doesn't distinguish how many, only whether that number is > 0. One thing that comes to mind is whether to use [fw=number of insiders] in my regressions. If I do so, the results change quite a lot. Any advice on that? (number of insiders is . if trade_dummy=0). I am not really sure how Stata calculates frequency weights in this case.

                          The second variable is the size of the trade, which is present only if a trade happens. It is calculated as the sum of shares traded by all firm insiders in a given month. In this case the linear model works fine. To gauge the economic magnitude, the average change divided by the outcome, I should use the -margins, eydx- command. In that case, the absolute change is -0.012 and the eydx is -15%, which seems like a considerable change. Graphically, after the treatment the treated - control difference has visibly more points below 0. But the results yield no significant p-values. What should the argumentation be here? "Even though the magnitude of the coefficient might seem economically relevant, the high p-values fail to detect statistical significance."
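                          For reference, the eydx calculation I ran was along these lines (trade_size and Controls are placeholders for my actual variables):

```stata
* Linear DiD on trade size with firm fixed effects, then the
* average semi-elasticity d(ln y)/dx of the during indicator
* among treated firms.
areg trade_size i.treated##i.during Controls i.month, ///
    absorb(permno) vce(cluster permno)
margins, eydx(during) at(treated=1)
```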

                          Comment


                          • #14
                            Now, I see that your linear and logistic models refer to two different outcome variables (or at least one of your linear models does). I am very suspicious about interpreting semi-elasticity (eydx) if any of the observations have outcome values close to zero, and it is really not meaningful to look at semi-elasticity at all if any of the observations have negative or zero outcome values. Since the sum of shares traded sounds like it cannot be negative, we are partly safe, but it does sound like zero values are possible and close-to-zero values are even likely. So I would be inclined to disregard the eydx results as they are overly influenced by those near-zero observations. The absolute change sounds more meaningful in this context.

                            As for a result that seems substantively relevant but not statistically significant, it just means that your data are not up to the task of distinguishing an effect of that substantively relevant magnitude from noise in the data. Whether this is due to excessive noise in your measurements or an inadequate sample size I cannot say.

                            Your proposal to use fweights in your model strikes me as way off base. Frequency weights are used when a single observation in the data set actually represents a larger number of actually identical observations. Frequency weights are most commonly needed when individual-level data has been aggregated into a smaller number of observations by removing duplicate observations and counting how many there were. This is not your situation here. What you have is a situation where there is a varying number of traders who are "at risk" to trade, but your outcome variable simply indicates whether any of them traded or not. This cannot be resolved, so far as I can see, with any weighting scheme.

                            If I were going to try to improve that model, I would change it altogether. Instead of a single dichotomous variable indicating whether any insiders traded, I would have the outcome be the actual count of the number of insiders who traded. I would then use a generalized linear model with a binomial distribution family and a logit or probit link to model that. Since you have panel data and need to account for that structure, and there is no -xtglm-, I would go to -melogit- for this. I know that in finance random effects models are viewed skeptically, but I think that the risk of some endogeneity in the model is a small price to pay for having an outcome measure and error distribution family that bears a closer resemblance to the real-world data generating process. Your current approach of a dichotomous outcome for whether any insider trades strikes me as discarding too much information.
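                            A minimal sketch of what that model might look like, assuming n_traded counts the insiders who traded and n_insiders counts those at risk in each firm-month (both names are placeholders, and n_insiders would need to be filled in for the no-trade months rather than left missing):

```stata
* Binomial (logit-link) mixed model: n_traded "successes" out of
* n_insiders "trials" per firm-month, with a random intercept by firm.
melogit n_traded i.treated##i.during i.month ///
    || permno: , binomial(n_insiders) vce(cluster permno)
```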

                            Comment


                            • #15
                              Thanks, Clyde, for your thorough answer.

                              I guess I haven't really thought much about the fweights. You are right.

                              Due to the nature of my data, observations don't occur regularly, so I was thinking that maybe I should aggregate to a broader timeframe, for instance quarterly. Do you think this is a good idea, and would it improve my results?
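                              Something like the following is what I had in mind, assuming month is a %tm monthly date and trade_dummy / trade_size are my outcome variables (names are placeholders):

```stata
* Collapse firm-months to firm-quarters: any trade in the quarter,
* and total shares traded over the quarter.
gen qdate = qofd(dofm(month))
format qdate %tq
collapse (max) trade_dummy (sum) trade_size, by(permno qdate)
```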

                              Comment
