
  • #16
    Also how would I go about doing the same modeling for the genexp, protfault, and disclos variables?
    Just replace each appearance of apology in the regression and margins commands by genexp, etc. (one set of commands at a time).



    • #17
      The problem is that when I do the
      Code:
      collapse (count) n_lawsuits = payment2 (first)  apology, by(workstat2 pre_post)
      command, it replaces my data set with a new one that does not have genexp, protfault, or disclos variables corresponding to n_lawsuits. Also, did the data for the n_lawsuits x apology look correct? It doesn't give exact numbers of the before and after of lawsuits.

      Thanks for all the help!! Almost done; after we figure this out I might have a few questions about the reasoning behind the code, but that's about it.



      • #18
        Well, like apology, genexp, protfault, and disclos are constant within workstat2, right? So then it's

        Code:
        collapse (count) n_lawsuits = payment2 (first) apology genexp protfault disclos, by(workstat2 pre_post)
        and then you should be good to go.

        The model predicted values will in general not be equal to the observed values. You are fitting a model that makes certain assumptions about the form of the error distribution. That assumption will, in general, not match the reality of the data-generating process. Ideally it is reasonably close, but it will seldom be exact. So when the model estimates the parameters (coefficients and errors) it finds the closest fit to the data that is possible within the constraints imposed by that assumption. But the fit will not be exact. Now, if the fit is really poor, then you have to think about using a different model. But you should not, in general, expect exact correspondence between predicted and observed values.
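        As a toy illustration of that point (hypothetical numbers, sketched in Python rather than Stata): for a constant-rate Poisson model, the maximum-likelihood fit is just the sample mean, which in general equals none of the individual observations.

```python
# Toy illustration (not the thread's data): the MLE of a constant
# Poisson rate is the sample mean, so the fitted value will
# generally not reproduce any individual observation exactly.
counts = [2, 3, 7]  # hypothetical lawsuit counts
lam = sum(counts) / len(counts)  # Poisson MLE of the rate
print(lam)                        # fitted (predicted) value: 4.0
print(lam in counts)              # False: best fit matches no observation
```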

        Also, did the data for the n_lawsuits x apology look correct?
        I don't know the content here, so I can't really judge whether these results look plausible or not. I don't know what plausible even means in this context, let alone correct. What I can say is that the commands look correct. But I have some advice to make the output easier to work with. I had forgotten that after -xtpoisson, fe-, the default prediction for -margins- is the linear prediction, xb. That's not really very useful, because it's the logarithm of what you're really interested in. So I would redo the -xtpoisson, fe- runs, and add to the -margins- commands the option -predict(nu0)-. That way the things that -margins- predicts will be numbers of events rather than logarithms. The results will be more natural and easier to understand.



        • #19
          Thanks again, but I'm still not sure the n_lawsuits numbers are correct. Even with -predict(nu0)-, the numbers look more like logs. It could also be because the variable at play is binary (0/1), rather than a count like n_lawsuits. Here is the data.

          Code:
          . xtset workstat2
                 panel variable:  workstat2 (unbalanced)
          
          . xtpoisson n_lawsuits i.apology##i.pre_post, fe vce(robust)
          
          Iteration 0:   log pseudolikelihood =  -1548.224  
          Iteration 1:   log pseudolikelihood = -319.24879  
          Iteration 2:   log pseudolikelihood = -319.02494  
          Iteration 3:   log pseudolikelihood = -319.02494  
          
          Conditional fixed-effects Poisson regression    Number of obs     =         88
          Group variable: workstat2                       Number of groups  =         44
          
                                                          Obs per group:
                                                                        min =          2
                                                                        avg =        2.0
                                                                        max =          2
          
                                                          Wald chi2(2)      =     352.79
          Log pseudolikelihood  = -319.02494              Prob > chi2       =     0.0000
          
                                            (Std. Err. adjusted for clustering on workstat2)
          ----------------------------------------------------------------------------------
                           |               Robust
                n_lawsuits |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -----------------+----------------------------------------------------------------
                 1.apology |          0  (omitted)
                1.pre_post |  -.4459613   .0297531   -14.99   0.000    -.5042763   -.3876463
                           |
          apology#pre_post |
                      1 1  |   .0684329    .044695     1.53   0.126    -.0191677    .1560336
          ----------------------------------------------------------------------------------
          
          . margins apology#pre_post, predict(nu0) noestimcheck
          
          Adjusted predictions                            Number of obs     =         88
          Model VCE    : Robust
          
          Expression   : Predicted number of events (assuming u_i=0), predict(nu0)
          
          ----------------------------------------------------------------------------------
                           |            Delta-method
                           |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -----------------+----------------------------------------------------------------
          apology#pre_post |
                      0 0  |          1          .        .       .            .           .
                      0 1  |   .6402085   .0190482    33.61   0.000     .6028748    .6775423
                      1 0  |          1          .        .       .            .           .
                      1 1  |   .6855537    .022865    29.98   0.000     .6407391    .7303684
          ----------------------------------------------------------------------------------
          
          . 
          . margins apology, dydx(pre_post) predict(nu0) noestimcheck
          
          Conditional marginal effects                    Number of obs     =         88
          Model VCE    : Robust
          
          Expression   : Predicted number of events (assuming u_i=0), predict(nu0)
          dy/dx w.r.t. : 1.pre_post
          
          ------------------------------------------------------------------------------
                       |            Delta-method
                       |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          1.pre_post   |
               apology |
                    0  |  -.3597915   .0190482   -18.89   0.000    -.3971252   -.3224577
                    1  |  -.3144463    .022865   -13.75   0.000    -.3592609   -.2696316
          ------------------------------------------------------------------------------
          Note: dy/dx for factor levels is the discrete change from the base level.
          
          . xtpoisson n_lawsuits i.genexp##i.pre_post, fe vce(robust)
          
          Iteration 0:   log pseudolikelihood =  -1548.224  
          Iteration 1:   log pseudolikelihood = -308.50506  
          Iteration 2:   log pseudolikelihood = -308.26194  
          Iteration 3:   log pseudolikelihood = -308.26194  
          
          Conditional fixed-effects Poisson regression    Number of obs     =         88
          Group variable: workstat2                       Number of groups  =         44
          
                                                          Obs per group:
                                                                        min =          2
                                                                        avg =        2.0
                                                                        max =          2
          
                                                          Wald chi2(3)      =     345.15
          Log pseudolikelihood  = -308.26194              Prob > chi2       =     0.0000
          
                                           (Std. Err. adjusted for clustering on workstat2)
          ---------------------------------------------------------------------------------
                          |               Robust
               n_lawsuits |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          ----------------+----------------------------------------------------------------
                   genexp |
                       1  |          0  (omitted)
                       2  |          0  (omitted)
                          |
               1.pre_post |  -.4420631   .0310951   -14.22   0.000    -.5030085   -.3811178
                          |
          genexp#pre_post |
                     1 1  |   .0242712   .0502038     0.48   0.629    -.0741263    .1226688
                     2 1  |   .1231934   .0654282     1.88   0.060    -.0050435    .2514303
          ---------------------------------------------------------------------------------
          
          . margins genexp, dydx(pre_post) predict(nu0) noestimcheck
          
          Conditional marginal effects                    Number of obs     =         88
          Model VCE    : Robust
          
          Expression   : Predicted number of events (assuming u_i=0), predict(nu0)
          dy/dx w.r.t. : 1.pre_post
          
          ------------------------------------------------------------------------------
                       |            Delta-method
                       |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
          1.pre_post   |
                genexp |
                    0  |  -.3572909   .0199851   -17.88   0.000    -.3964611   -.3181208
                    1  |  -.3415008   .0259545   -13.16   0.000    -.3923706   -.2906309
                    2  |  -.2730297   .0418494    -6.52   0.000     -.355053   -.1910065
          ------------------------------------------------------------------------------
          Note: dy/dx for factor levels is the discrete change from the base level.
          
          . margins genexp#pre_post, predict(nu0) noestimcheck
          
          Adjusted predictions                            Number of obs     =         88
          Model VCE    : Robust
          
          Expression   : Predicted number of events (assuming u_i=0), predict(nu0)
          
          ---------------------------------------------------------------------------------
                          |            Delta-method
                          |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
          ----------------+----------------------------------------------------------------
          genexp#pre_post |
                     0 0  |          1          .        .       .            .           .
                     0 1  |   .6427091   .0199851    32.16   0.000     .6035389    .6818792
                     1 0  |          1          .        .       .            .           .
                     1 1  |   .6584992   .0259545    25.37   0.000     .6076294    .7093691
                     2 0  |          1          .        .       .            .           .
                     2 1  |   .7269703   .0418494    17.37   0.000      .644947    .8089935
          ---------------------------------------------------------------------------------
          I uploaded a .png of how the data looks so you can check it out. It still doesn't look like it's giving me the number of events.

          Attached Files



          • #20
            Well, there is one huge, obvious problem in your data: there shouldn't be any observations where pre_post is missing. Worse yet, in your data they constitute the majority of the lawsuits. You have to fix that!

            Putting aside the limitations of the data, the outputs look reasonable to me. Remember that -xtpoisson, fe- is a purely within-state estimator. Working with the n_lawsuits vs apology analyses, notice that in the first -margins- output, the pre-intervention outputs are just arbitrarily set to 1. So you have to look at the post-intervention numbers in that table as being relative to the (unknown and unestimable) pre-intervention value. (Yes, this model is very complicated.) Looking at the second -margins- output, we can see that in both treatment and control groups, the number of lawsuits is a bit lower in the post-period than in the pre-period. The amount of decrease is about 0.36 in the control group and about 0.31 in the intervention group.
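            To see the arithmetic behind those numbers: the post-period margins are just exponentiated sums of the coefficients, and the dy/dx values are the post-minus-pre discrete changes. A quick check (sketched in Python rather than Stata, using the numbers from the output in #19):

```python
import math

# Coefficients from the xtpoisson output in #19 (log scale)
b_pre_post = -0.4459613   # 1.pre_post
b_interact = 0.0684329    # 1.apology#1.pre_post

# With each pre-period cell normalized to 1, the post-period margins
# are exp() of the relevant coefficient sums on the count scale:
control_post = math.exp(b_pre_post)               # ~0.6402, matches margin (0 1)
treated_post = math.exp(b_pre_post + b_interact)  # ~0.6856, matches margin (1 1)

# The dy/dx values are the discrete changes from the base level of 1:
print(round(control_post - 1, 4))  # ~-0.3598, matches dy/dx at apology = 0
print(round(treated_post - 1, 4))  # ~-0.3144, matches dy/dx at apology = 1
```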

            But, as I say, there is something seriously wrong with your data that needs to be fixed. I do not think that the -collapse- command created these missing values of pre_post. They must have been pre-existing in the data. That means that they were present in the earlier analyses of the other outcomes--and those will also be wrong as a result.

            In the future, please don't post screen shots of your data. If I needed to work with this in Stata, there is no way to import the data from a screen shot. And I know I would not have had the motivation to type the data into the data editor. The helpful way to show example data is with the -dataex- command. You can install it by running -ssc install dataex- and then run -help dataex- to read the simple instructions for using it. Use -dataex- whenever you show example data here on Statalist. It enables the people who are helping you to create a completely faithful replica of your example data in Stata with a simple copy/paste operation.



            • #21
              I ended up deleting any observation that was not marked 0 or 1 on pre_post, and it didn't make a difference to the results. I have about 34 years' worth of lawsuits in the data set, but for the sake of this experiment I was only looking at the 1-3 years before implementation of the law and 3-5 years after, unless you think I should have widened the constraints in general. The problem with that is that it would pretty much eliminate any states that implemented their law after 2010, as the dataset doesn't have the full 2016 data. I already had to eliminate 6 states from this study as a result of their implementing their laws too recently.

              The regressions for the number of lawsuits worked, so I think I am mostly set with the data for now, unless you think I have to change it. Now on to the next step: I need to figure out what is significant (the P>|z| being at 0.080 for the payments after genexp was introduced kills me), and how to properly graph this into something like this:

              Thanks for everything!
              Attached Files



              • #22
                So, the missing values for pre_post correspond to years that were "not in universe" for the study. That's fine, then. I would have dropped them from the data set before running the analysis. The results would be the same either way, since they are automatically excluded from the analysis by virtue of having a missing value on a model variable. But it makes the code clearer if you explicitly drop them (or explicitly exclude them with an -if- condition on the regression command). If you have to go back and review and explain your code 6 months from now, it may not be obvious what happened.

                Anyway, with regard to graphing your results, you can run the -marginsplot- command immediately after the corresponding -margins- command. If you don't like the particular way that Stata lays out the graph, you can use pretty much any -graph twoway- option with your -marginsplot- command in order to customize the appearance of the graph to your preferences.



                • #23
                  Okay, can you help me understand what the commands I entered relate to?

                  For the first one:
                  Code:
                  xtreg payment2 i.apology##i.pre_post, fe vce(cluster workstat2)
                  I understand it is a regression comparing apology to before and after. "fe" makes it fixed-effects and keeps the regression within the same group of subjects. Cluster was so that each individual state's bias could be taken into account. The P>|t| is not significant in any of the regressions, so does that mean there is no correlation? Also, do I care about sigma_u, sigma_e, and rho?

                  Next for
                  Code:
                  margins disclos#pre_post, noestimcheck
                  every data set seems to be significant, but all it really shows is the predicted values of before and after in an easy-to-digest way. Why does it have to predict, and why can't the data in itself be significant and used?

                  Finally
                  Code:
                  margins disclos, dydx(pre_post) noestimcheck
                  shows the derivative, which we can just say is the predicted change over time of the regression. Some of these results are significant at a 95% confidence level, some aren't. Is this the most important part of the data?

                  Sorry for all the questions, just getting ready to explain the analysis and I wish I was better at stats to understand all this haha.



                  • #24
                    The P > t is not significant in any of the regressions, so does that mean there is no correlation?
                    This is a very common misconception. In fact, probably the majority of people who use the term "statistical significance" would think that. That's why I teach my students never to use the term and tell them to ignore p-values in nearly all situations. Each p-value in the results you see represents a test of a null hypothesis. Specifically, it tests the null hypothesis that the population value of the coefficient (or predicted value or marginal effect in the -margins- output) in that row of the output table is zero. If the result is not statistically significant, the interpretation is that the magnitude of the estimate is close enough to zero, and the imprecision of the estimate derived from the data is large enough, that we cannot confidently exclude the possibility that the population value is zero. It does not affirm that the value is zero or that there is "no correlation." It just says we can't tell if the correlation is zero or not, and we can't even tell whether it's positive or negative with adequate confidence. It is a statement that the uncertainty about the zero-ness of that parameter of the model remains unresolved.
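                    For reference, the z and p in each row come directly from the estimate and its standard error, and the confidence interval is the estimate plus or minus 1.96 standard errors. A quick check (in Python, against the 1.pre_post row of the xtpoisson output in #19):

```python
import math

# 1.pre_post row from the xtpoisson output: coefficient and robust SE
coef, se = -0.4459613, 0.0297531

z = coef / se                          # ~-14.99, as printed in the table
p = math.erfc(abs(z) / math.sqrt(2))   # two-sided p from the standard normal

# 95% confidence interval: estimate +/- 1.959964 standard errors
lo = coef - 1.959964 * se              # ~-0.5042763, matching the table
hi = coef + 1.959964 * se              # ~-0.3876463, matching the table
print(round(z, 2), round(lo, 7), round(hi, 7))
```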

                    Now, if you think about that, it is, in general, not a particularly useful conclusion to be able to draw. In fact, when you do get a statistically significant result, the conclusion runs "the magnitude of the estimated value is large enough and the precision with which it is estimated is small enough, that these sample data would be unlikely to have arisen in a world where the population value of the parameter is zero." And if you think about that, that's not a very useful conclusion either.

                    But it gets worse. Many of these null hypotheses about zero parameters are just straw men. For example, in the -margins- outputs that show predicted values in the groups in each time period, could anyone seriously entertain a (null) hypothesis that the mean payment amount in a lawsuit is zero? Of course not. It's a preposterous null hypothesis. That's why, even if you really believe in and like p-values, the p-values in that table should be ignored. Only if you are working on some problem where the population average in some group(s) in your data might actually be zero, and where it is of some importance to know whether that is the case or not, would there be any point in looking at the p-values in that table. Such situations are uncommon, and yours isn't one of them. Those p-values are a waste of ink/pixels.

                    And there is more bad news. Some of the parameters in a model are nuisance parameters. They are there because they are needed to estimate the parameters of interest, but are of no interest themselves. In your case, because you are using fixed-effects regression, the most egregious of the nuisance parameters is omitted due to collinearity with the fixed effects. So you don't have this issue.

                    Anyway, you have to focus on your research goal, which is to estimate the effectiveness of apology legislation: is it associated with a change in payments, number of lawsuits, etc.? The line of the regression results table that gives you that estimate is the line with the coefficient of 1.apology#1.pre_post. This is your "pay dirt." This is the difference in differences. That is, it is the difference between:
                    a) the difference between pre- and post- legislation expected payments in the states that adopted an apology law, and
                    b) the difference between pre- and post- legislation expected payments in the states that did not adopt an apology law.

                    That coefficient is your DID estimate of the effect of adopting an apology law on payments; it is the focal statistic in all of the output. The 95% confidence interval is a reasonable way to see how precisely your data have enabled you to estimate that difference. It gives you a sense of the range of values of the actual population-level difference in differences that are consistent with getting a sample of data that looks like yours (barring some freak occurrence). So you should look at that estimate and that interval and ask yourself the following questions:

                    How big is it in practical terms? Is it big enough to matter? Does this effect justify whatever costs and downsides (if any) come with adopting the legislation? Is our estimate precise enough that the answers to these questions wouldn't change if the actual effect were nearer to either end of the confidence interval?

                    If the answers to those questions are yes, then you can proclaim the policy a success (at least with respect to this outcome, and so far). If not, then there are several possibilities. Perhaps the effect was too small to matter practically and the confidence interval was narrow enough that even the "best case" result would be of no practical importance. Then you could proclaim the policy a failure (at least with respect to this outcome, and so far). Or perhaps the estimate was not all that small (or was small) but the confidence interval was very wide, so that if the actual effect were near opposite ends of the confidence interval you would reach opposite conclusions about the usefulness of the legislation. In that case you can say that the results are inconclusive and more research would be needed to answer the question.*

                    If you reach the main conclusion that adopting the legislation was associated with changes large enough, and precisely enough estimated, to matter, then you probably will also want to show your audience the output that's in the -margins- tables. They would be interested in what the average payments were in both groups before and after the legislation date, and they would probably want to know how large a change took place after the law was adopted in each group. That's what those tables give you. I would present those by giving the estimates and the confidence intervals, or the estimates and the standard errors.

                    So that's how I approach interpreting these models. Notice that I have not used the term "significant" except to disparage it. It plays no useful role in this kind of situation. All it does is confuse people and mislead them into thinking that a non-significant result means no effect. (And there are many other fallacies associated with p-values and statistical significance.)

                    My advice is to review all of your outputs for the different outcomes and the different predictors from this perspective. When you've done that and have understood what your data are telling you, then, if you have nothing better to do with your time, you might go back and look at the p-values, just for laughs.

                    *I'm being a little simplistic here. All of this assumes that the model is a good one and the data were appropriately collected and so on. You might decide, on deeper consideration of the sampling methods, or the measurements of the outcomes, or other aspects of study design, that the data weren't very good and you don't want to draw any conclusion from them at all. I'm simply abstracting away from these design issues and discussing how to interpret the statistics on the assumption that the design is strong.



                    • #25
                      Wow, thanks for the very lengthy response; if only the professors at the med school I attend were as good as you!

                      So let's go through two examples:
                      Code:
                      . xtreg payment2 i.genexp##i.pre_post, fe vce(cluster workstat2)
                      note: 1.genexp omitted because of collinearity
                      note: 2.genexp omitted because of collinearity
                      
                      Fixed-effects (within) regression               Number of obs     =     62,701
                      Group variable: workstat2                       Number of groups  =         44
                      
                      R-sq:                                           Obs per group:
                           within  = 0.0007                                         min =         75
                           between = 0.0594                                         avg =    1,425.0
                           overall = 0.0053                                         max =      9,653
                      
                                                                      F(3,43)           =       1.26
                      corr(u_i, Xb)  = 0.2796                         Prob > F          =     0.2989
                      
                                                      (Std. Err. adjusted for 44 clusters in workstat2)
                      ---------------------------------------------------------------------------------
                                      |               Robust
                             payment2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      ----------------+----------------------------------------------------------------
                               genexp |
                                   1  |          0  (omitted)
                                   2  |          0  (omitted)
                                      |
                           1.pre_post |  -4541.811   12249.42    -0.37   0.713    -29245.12     20161.5
                                      |
                      genexp#pre_post |
                                 1 1  |  -41235.08   28872.65    -1.43   0.160    -99462.34    16992.17
                                 2 1  |  -4663.423   17158.73    -0.27   0.787    -39267.31    29940.46
                                      |
                                _cons |   317271.1   4899.144    64.76   0.000     307391.1    327151.2
                      ----------------+----------------------------------------------------------------
                              sigma_u |  89418.111
                              sigma_e |  585450.14
                                  rho |  .02279588   (fraction of variance due to u_i)
                      ---------------------------------------------------------------------------------
                      
                      . 
                      . margins genexp#pre_post, noestimcheck
                      
                      Adjusted predictions                            Number of obs     =     62,701
                      Model VCE    : Robust
                      
                      Expression   : Linear prediction, predict()
                      
                      ---------------------------------------------------------------------------------
                                      |            Delta-method
                                      |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      ----------------+----------------------------------------------------------------
                      genexp#pre_post |
                                 0 0  |   317271.1   4899.144    64.76   0.000       307669    326873.3
                                 0 1  |   312729.3   11766.09    26.58   0.000     289668.2    335790.4
                                 1 0  |   317271.1   4899.144    64.76   0.000       307669    326873.3
                                 1 1  |   271494.2   21752.82    12.48   0.000     228859.5      314129
                                 2 0  |   317271.1   4899.144    64.76   0.000       307669    326873.3
                                 2 1  |   308065.9   11669.46    26.40   0.000     285194.2    330937.6
                      ---------------------------------------------------------------------------------
                      
                      . 
                      . margins genexp, dydx(pre_post) noestimcheck
                      
                      Conditional marginal effects                    Number of obs     =     62,701
                      Model VCE    : Robust
                      
                      Expression   : Linear prediction, predict()
                      dy/dx w.r.t. : 1.pre_post
                      
                      ------------------------------------------------------------------------------
                                   |            Delta-method
                                   |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                      1.pre_post   |
                            genexp |
                                0  |  -4541.811   12249.42    -0.37   0.711    -28550.23    19466.61
                                1  |  -45776.89    26145.4    -1.75   0.080    -97020.93    5467.141
                                2  |  -9205.234   12015.57    -0.77   0.444    -32755.32    14344.85
                      ------------------------------------------------------------------------------
                      Note: dy/dx for factor levels is the discrete change from the base level.

                      According to the coefficients, the states that had the genexp law introduced are estimated to have a decrease in average malpractice payments of $41,235, those with another type of apology law saw a decrease of $4,663, and those without any apology law saw a decrease of $4,541 (this is probably wrong; do the other coefficients add in with this one?). Although the confidence intervals are wide, the huge decrease in the average payments of genexp states compared to the others would make me feel like this law was very successful. The numbers in the margins are a bit different, but the pattern is similar, so I'm going to guess those are the actual numbers and not the predicted numbers, with the confidence interval showing where they are certain the true number lies. Finally, the dy/dx is the slope of the linear regression line.
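                      To check whether the coefficients add, here is a quick sketch (in Python rather than Stata) with the numbers above; the cell margins do appear to be sums of the relevant coefficients, with the interaction terms stacking on top of the common 1.pre_post term:

```python
# Coefficients from the xtreg output above
cons       = 317271.1    # _cons (pre-period mean)
b_pre_post = -4541.811   # 1.pre_post
b_g1       = -41235.08   # 1.genexp#1.pre_post
b_g2       = -4663.423   # 2.genexp#1.pre_post

# Post-period predicted payments per genexp group:
print(cons + b_pre_post)         # ~312729.3 -> margin for genexp 0, post
print(cons + b_pre_post + b_g1)  # ~271494.2 -> margin for genexp 1, post
print(cons + b_pre_post + b_g2)  # ~308065.9 -> margin for genexp 2, post
```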

                      One thing I don't understand is why the margins and dy/dx for states without any apology laws differ across the regressions. Shouldn't they stay the same while the other estimates change? Not even the margins are the same?

                      Next:

                      Code:
                      . xtpoisson n_lawsuits i.genexp##i.pre_post, fe vce(robust)
                      
                      Iteration 0:   log pseudolikelihood =  -1548.224  
                      Iteration 1:   log pseudolikelihood = -308.50506  
                      Iteration 2:   log pseudolikelihood = -308.26194  
                      Iteration 3:   log pseudolikelihood = -308.26194  
                      
                      Conditional fixed-effects Poisson regression    Number of obs     =         88
                      Group variable: workstat2                       Number of groups  =         44
                      
                                                                      Obs per group:
                                                                                    min =          2
                                                                                    avg =        2.0
                                                                                    max =          2
                      
                                                                      Wald chi2(3)      =     345.15
                      Log pseudolikelihood  = -308.26194              Prob > chi2       =     0.0000
                      
                                                       (Std. Err. adjusted for clustering on workstat2)
                      ---------------------------------------------------------------------------------
                                      |               Robust
                           n_lawsuits |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      ----------------+----------------------------------------------------------------
                               genexp |
                                   1  |          0  (omitted)
                                   2  |          0  (omitted)
                                      |
                           1.pre_post |  -.4420631   .0310951   -14.22   0.000    -.5030085   -.3811178
                                      |
                      genexp#pre_post |
                                 1 1  |   .0242712   .0502038     0.48   0.629    -.0741263    .1226688
                                 2 1  |   .1231934   .0654282     1.88   0.060    -.0050435    .2514303
                      ---------------------------------------------------------------------------------
                      
                      .
                      . margins genexp#pre_post, predict(nu0) noestimcheck
                      
                      Adjusted predictions                            Number of obs     =         88
                      Model VCE    : Robust
                      
                      Expression   : Predicted number of events (assuming u_i=0), predict(nu0)
                      
                      ---------------------------------------------------------------------------------
                                      |            Delta-method
                                      |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      ----------------+----------------------------------------------------------------
                      genexp#pre_post |
                                 0 0  |          1          .        .       .            .           .
                                 0 1  |   .6427091   .0199851    32.16   0.000     .6035389    .6818792
                                 1 0  |          1          .        .       .            .           .
                                 1 1  |   .6584992   .0259545    25.37   0.000     .6076294    .7093691
                                 2 0  |          1          .        .       .            .           .
                                 2 1  |   .7269703   .0418494    17.37   0.000      .644947    .8089935
                      ---------------------------------------------------------------------------------
                      
                      .
                      . margins genexp, dydx(pre_post) predict(nu0) noestimcheck
                      
                      Conditional marginal effects                    Number of obs     =         88
                      Model VCE    : Robust
                      
                      Expression   : Predicted number of events (assuming u_i=0), predict(nu0)
                      dy/dx w.r.t. : 1.pre_post
                      
                      ------------------------------------------------------------------------------
                                   |            Delta-method
                                   |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                      1.pre_post   |
                            genexp |
                                0  |  -.3572909   .0199851   -17.88   0.000    -.3964611   -.3181208
                                1  |  -.3415008   .0259545   -13.16   0.000    -.3923706   -.2906309
                                2  |  -.2730297   .0418494    -6.52   0.000     -.355053   -.1910065
                      ------------------------------------------------------------------------------
                      Note: dy/dx for factor levels is the discrete change from the base level.
                      The coefficients for both the genexp and general apology laws are positive, but the pre_post coefficient is at -.44, meaning that having genexp would put the linear regression coefficient at ~-.42 and the states with general apology laws at ~-.32. According to the first margins and the dy/dx, states without an apology law saw their number of lawsuits decrease by about 35% (is that a fair way of using the data?), genexp states decrease by 34%, and those with general apology laws decrease by 27%. While the number of lawsuits in genexp states did not decrease beyond those without apology laws, it did decrease more than in the states that had other types of apology laws, making this a half-win.
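
                      (As a check on my own reading of the output: since predict(nu0) sets the fixed effect u_i to 0, each margin should just be exp() of the summed coefficients. The numbers below are copied from the coefficient table above, so this is only a sanity check, not part of the model fitting.)

                      Code:
                      * with u_i = 0, the Poisson prediction is exp(sum of relevant coefficients)
                      display exp(-.4420631)             // genexp = 0: matches margin .6427091
                      display exp(-.4420631 + .0242712)  // genexp = 1: matches margin .6584992
                      display exp(-.4420631 + .1231934)  // genexp = 2: matches margin .7269703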
                      Last edited by Issy Ojalvo; 30 Mar 2017, 12:34.

                      Comment


                      • #26
                        According to the coefficients, the states that had the genexp law introduced is estimated to have a decrease in average malpractice payments by $41,235, those with another type of apology law saw a decrease by $4663, and those without any apology law saw a decrease in $4541 (this is probably wrong, does the other coefficients add in with this one?).
                        Your interpretation is, indeed, incorrect. You are trying to read marginal effects out of the coefficient table. The amounts by which malpractice payments decreased in each group are found in your second -margins- output. In the group with genexp = 0 (no apology law?), the decrease was about $4541. In the group with genexp = 1, the decrease was $45,777, and in the group with genexp = 2, the decrease was $92.05. Although I have only minimal expertise in the economics of malpractice suits, I do believe that a decrease of $92.05 is negligible. A decrease of $4,541 is not negligible in its own right, but relative to a baseline value of over $317,000 (a round number based on the results in your first -margins- output), this is pretty small potatoes. On the other hand, a decrease of $45,777 is about a 14% difference, which is probably noteworthy. Now, the confidence interval around this $45,777 is pretty wide, so this effect has been estimated with rather limited precision: we can't even really be sure it's a decrease and not a (very small) increase. Clearly, if it were really a decrease of $97,021, that would be quite remarkable indeed. But if it were an increase of $5,467, then that would certainly be a disappointment. (I'm assuming that the intended goal is to decrease, not increase, payments, am I right?) So my conclusion here would be that there is suggestive evidence that the adoption of a genexp = 1 policy is associated with a decrease in malpractice payments, most likely in the vicinity of $45,777. But due to limited sample size and high variability in malpractice payments themselves, this estimate is very imprecise, and the data would be consistent with an estimate over twice as large, or with an estimate that goes slightly in the opposite direction. A suggestive but inconclusive study; further research is needed here.
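
                        (The 14% figure is just the estimated decrease divided by the rounded baseline from the first -margins- output:)

                        Code:
                        display 45777/317000   // ≈ .144, i.e., about 14%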

                        The coefficients for both laws with genexp and general apology laws are positive, but the pre_post coefficient is at -.44, meaning that having genexp would make the linnear regression coefficient be at ~-.41 and the states with general apology laws at ~-.32.
                        I'm not following you here.

                        According to the first margins and the dy/dx, states without an apology saw their number of lawsuits decrease by about 35% (is that a fair way of using the data?), genexp states decrease by 34%, and those with general apology laws decrease by 27%. While the number of lawsuits did not decrease beyond those without apology laws, it did decrease more than the states that had other types of apology laws.
                        Yes, as far as it goes. But, again, the precision of the estimates is also important. The confidence intervals largely overlap each other. So it isn't clear-cut even that genexp = 2 is associated with less of a change than genexp = 1. What if the real change with genexp = 2 were a decrease of 35.5% and that of genexp = 1 were 29.1%? Then even the rank ordering would be reversed. So this isn't very conclusive at all. To get a better handle on this, I would now run:

                        Code:
                        margins genexp, dydx(pre_post) predict(nu0) noestimcheck pwcompare
                        The addition of the pwcompare option will cause Stata to directly contrast each of those three marginal effects with each of the others and give you confidence intervals for the differences between those effects. If the differences between those effects have narrow confidence intervals, then you will be able to draw clear conclusions about differences between those interventions.

                        Comment


                        • #27
                          In the group with genexp = 0 (no apology law?), the decrease was about $4541. In the group with genexp = 1, the decrease was $45,777, and in the group with genexp = 2, the decrease was $92.05. Although I have only minimal expertise in the economics of malpractice suits, I do believe that a decrease of $92.05 is negligible. A decrease of $4,541 is not negligible in its own right, but relative to a baseline value of over $317,000 (a round number based on the results in your first -margins- output), this is pretty small potatoes. On the other hand, a decrease of $45,777 is about a 14% difference, which is probably noteworthy.
                          I think you meant a decrease of $9205, right? Which is better than a decrease of $4,541, but relatively not that much. But thanks for explaining; I know how to interpret the data a bit better now. You're supposed to add the coefficients of 1 or 2 to the 1.pre_post coefficient, so don't worry about the part where you couldn't follow me.

                          Anyways, this is the data I got when using -pwcompare-:

                          Code:
                          . margins genexp, dydx(pre_post) predict(nu0) noestimcheck pwcompare
                          
                          Pairwise comparisons of conditional marginal effects
                          Model VCE    : Robust
                          
                          Expression   : Predicted number of events (assuming u_i=0), predict(nu0)
                          dy/dx w.r.t. : 1.pre_post
                          
                          --------------------------------------------------------------
                                       |   Contrast Delta-method         Unadjusted
                                       |      dy/dx   Std. Err.     [95% Conf. Interval]
                          -------------+------------------------------------------------
                          1.pre_post   |
                                genexp |
                               1 vs 0  |   .0157902   .0327573     -.0484129    .0799933
                               2 vs 0  |   .0842612   .0463765      -.006635    .1751574
                               2 vs 1  |    .068471   .0492443     -.0280461    .1649882
                          --------------------------------------------------------------
                          Note: dy/dx for factor levels is the discrete change from the
                                base level.
                          My interpretation would be that, more than likely, you could tell 0 and 1 are better than 2 if you are hoping to decrease the number of malpractice lawsuits. But the confidence interval between 0 and 1 is too wide to safely say that one is better than the other. Thanks, I'll do it for all the comparisons I am unsure of. Could you still explain why the margins and dy/dx for states without any apology laws differ in each regression? Shouldn't they stay the same while the other regressions change, since the state composition doesn't change?
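
                          (The contrasts look like they are just differences of the dy/dx values from the earlier margins output, up to display rounding:)

                          Code:
                          display -.3415008 - (-.3572909)   // 1 vs 0: ≈ .0157902
                          display -.2730297 - (-.3572909)   // 2 vs 0: ≈ .0842612
                          display -.2730297 - (-.3415008)   // 2 vs 1: ≈ .0684711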

                          Comment


                          • #28
                            I think you meant a decrease of $9205, right? Which is better than a decrease of $4,541, but relatively not that much.
                            Yes, indeed. Sorry about that misreading. When the number scrolls off the top of the screen when I am writing in the post window on the bottom, I guess I just remember the sound of "ninety-two-oh-five" and mixed up where the decimal point was.

                            Could you still explain why the margins and dy/dx for states without any apology laws differ in each regression.
                            Are the exact same states in the control group for all of the regressions? And with the same times corresponding to pre-post? That is, is it always true that apology == 0 if and only if regexp == 0 if and only if protfault == 0 whenever pre_post == 0?

                            Code:
                            assert ((apology == 0) == (regexp == 0)) & ((regexp == 0) == (protfault == 0) ) if pre_post == 0

                            Comment


                            • #29
                              Yeah, it's the exact same states in the control groups. I used the average years before and after implementation of all the other laws to decide what years to use. The states are all those without apology laws. When do I use the code that you just gave me? I get the error "regexp not found"

                              I made it say
                              Code:
                              assert ((apology == 0) == (genexp == 0)) & ((disclos == 0) == (protfault == 0) ) if pre_post == 0
                              in case that is what you meant, and got the response "4,092 contradictions in 37,519 observations, assertion is false". So I'm going to double-check my data now.
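
                              A tabulation like this (a sketch, using the same variable names as the assert) should show which states fail the check:

                              Code:
                              * list the states where the assertion fails
                              tab workstat2 if !(((apology == 0) == (genexp == 0)) & ((disclos == 0) == (protfault == 0))) & pre_post == 0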

                              Edit: Ahh, Mississippi somehow snuck into the genexp variable when it doesn't have an apology law. I'll edit now and see what it does to all the data.
                              Last edited by Issy Ojalvo; 31 Mar 2017, 13:12.

                              Comment


                              • #30
                                Well, now my data is all smoothed out. Thanks for helping me catch that. It actually completely changes what my initial conclusion would be, at least for payments.

                                Check it out. Protfault is still the worst, but now the disclos and genexp variables look way better. Should I use a test to compare the two? If not, I'm ready to declare both successful.

                                Also, I want to check how much the state of Texas decreased its malpractice payments from before and after. I realized it played a very influential role. Its state number is workstat2 == 54. I'm almost done with everything!
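
                                For the Texas check, something like this should work (a sketch only: it assumes the un-collapsed data set, with payment2 as the payment variable, is back in memory):

                                Code:
                                * mean malpractice payment in Texas (workstat2 == 54), before vs. after
                                tabstat payment2 if workstat2 == 54, by(pre_post) statistics(mean count)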


                                Code:
                                 margins genexp, dydx(pre_post) noestimcheck
                                
                                Conditional marginal effects                    Number of obs     =     62,701
                                Model VCE    : Robust
                                
                                Expression   : Linear prediction, predict()
                                dy/dx w.r.t. : 1.pre_post
                                
                                ------------------------------------------------------------------------------
                                             |            Delta-method
                                             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
                                -------------+----------------------------------------------------------------
                                1.pre_post   |
                                      genexp |
                                          0  |   -5196.25   11769.18    -0.44   0.659    -28263.43    17870.92
                                          1  |  -46028.91    26614.4    -1.73   0.084    -98192.17    6134.352
                                          2  |  -9205.234   12015.57    -0.77   0.444    -32755.32    14344.85
                                ------------------------------------------------------------------------------
                                Note: dy/dx for factor levels is the discrete change from the base level.
                                
                                . margins disclos, dydx(pre_post) noestimcheck
                                
                                Conditional marginal effects                    Number of obs     =     62,701
                                Model VCE    : Robust
                                
                                Expression   : Linear prediction, predict()
                                dy/dx w.r.t. : 1.pre_post
                                
                                ------------------------------------------------------------------------------
                                             |            Delta-method
                                             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
                                -------------+----------------------------------------------------------------
                                1.pre_post   |
                                     disclos |
                                          0  |   -5196.25   11769.18    -0.44   0.659    -28263.43    17870.92
                                          1  |   -45733.4   27644.15    -1.65   0.098    -99914.94    8448.149
                                          2  |  -11750.78   11488.22    -1.02   0.306    -34267.28    10765.71
                                ------------------------------------------------------------------------------
                                Note: dy/dx for factor levels is the discrete change from the base level.
                                Last edited by Issy Ojalvo; 31 Mar 2017, 13:46.

                                Comment
