  • Marginal effects significance vs original model effects significance

    I often get questions like, "This variable has significant effects in the original (logit/probit/heckprobit/whatever) model, but its marginal effect is not significant. Why?" Or vice versa. I tend to just say that different hypotheses are being tested, but can somebody give a more elegant or complete explanation?
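
    A minimal sketch of the two tests being compared (using a built-in example dataset, not anything from the original question):

    Code:
    sysuse auto, clear
    * coefficient p-values test H0: beta = 0 in the latent index
    probit foreign weight mpg
    * AME p-values test H0: average of dPr(foreign=1)/dx = 0, which is
    * a nonlinear function of all the coefficients, not just one
    margins, dydx(*)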

  • Richard Williams
    replied
    Teresa, welcome to Statalist.

    I suggest reading the Statalist FAQ, especially point 12 about asking questions effectively. Showing exactly what you typed and how Stata responded makes it easier to see what you are talking about.

    Also, while I might add a link to old threads, I like to start a new thread rather than add on to a long older one. If something has 20+ posts I usually don’t want to go to the trouble of getting up to date on what has already been talked about.

    I personally am not too surprised by differences in significance levels between coefficients and marginal effects. Marginal effects can be computed in many ways, e.g. atmeans, asobserved, or at values chosen by the user, and these different ways can produce different significance levels. I usually just focus on the significance of the coefficients.
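
    For instance, a sketch of how one fitted model can yield several different marginal effects (built-in dataset; the at() values are chosen arbitrarily):

    Code:
    sysuse auto, clear
    probit foreign weight mpg
    margins, dydx(weight)                     // asobserved: the AME
    margins, dydx(weight) atmeans             // at the means of the covariates
    margins, dydx(weight) at(mpg=(15 25 35))  // at user-chosen values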



  • Teresa Schützeichel
    replied
    Following this thread, I am still confused as to why the significance levels of the original ivprobit coefficients and those of its average marginal effects are not identical. I ran an ivprobit that returned positive and significant coefficients (with varying p-values across specifications); however, the corresponding average marginal effects after running margins, dydx(*) predict(pr) were all positive but insignificant (p>0.1 for all specifications).
    The other thing is that the command margins, dydx(*) predict(pr) returns the average marginal effects of the instrumental variable as well; ideally this should not happen. Any idea why this is the case?
    Thanks in advance.
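
    A sketch of one workaround for the second issue (variable names hypothetical): list only the covariates of interest in dydx() instead of using the wildcard.

    Code:
    * y binary outcome, x exogenous, w endogenous, z excluded instrument
    ivprobit y x (w = z)
    * report marginal effects only for the structural covariates
    margins, dydx(x w) predict(pr)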



  • Edmondo Ricci
    replied
    Clyde Schechter

    Dear Clyde,

    Thank you again



  • Clyde Schechter
    replied
    I'm unable to respond to your specific question because I do not use instrumental variables in my work and I have only a very limited understanding of how they work.

    I can say that with linear models, the marginal effect is equal to the coefficient. With non-linear models, the marginal effect must be conditioned on particular values of the predictor variables (or averaged over their distribution) and can vary considerably, whereas regression coefficients are unconditional. So there is no necessary agreement between a regression coefficient and the infinitely many marginal effects associated with that variable.
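
    A sketch illustrating that contrast (built-in dataset; the at() values are arbitrary):

    Code:
    sysuse auto, clear
    * linear model: the marginal effect of weight is the coefficient,
    * identical at every point in the data
    regress price weight
    margins, dydx(weight)
    * nonlinear model: the marginal effect depends on where it is evaluated
    probit foreign weight
    margins, dydx(weight) at(weight=(2000 3000 4000))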



  • Edmondo Ricci
    replied
    Clyde Schechter

    Dear Clyde,

    Thank you for your post. Since tests on marginal effects and tests on coefficients are different tests, I can see that their statistical significance can differ.

    But would it even be possible for the coefficient to be positive and the marginal effect to be negative?

    I managed to construct a dataset where I get the following results.

    1. reg => positive
    2. ivreg => positive
    3. probit coefficient => positive
    4. probit marginal effect => positive
    5. ivprobit coefficient => positive
    6. ivprobit marginal effect => negative

    This is strange on so many levels. #5 and #6 having different signs is strange; #2 and #6 having different signs is strange.

    And if this is possible, in which situations would it occur?

    And if this is possible, shouldn't there be somewhere in the range of x where the ivprobit marginal effect is positive? (See the sketch below.)

    And how should I interpret this? Is X causing an increase in Y or a decrease in Y? All the others are positive and significant; only the marginal effects from ivprobit are negative (and sometimes significant).
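
    One way to probe that last question (a sketch with hypothetical names y, x, w, z and arbitrary cut-points) is to evaluate the marginal effect at several points across the observed range of the covariate:

    Code:
    * y binary outcome, x exogenous, w endogenous, z excluded instrument
    ivprobit y x (w = z)
    summarize x
    * trace dPr(y=1)/dx across the observed range of x
    margins, dydx(x) at(x=(0(0.5)3)) predict(pr)
    marginsplot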
    Last edited by Edmondo Ricci; 29 Mar 2019, 00:09.



  • Evelyn Mare
    replied
    I have encountered a similar problem: my logit coefficient has a p-value of <.05, while the corresponding average marginal effect has a p-value of >.15.

    Originally posted by Richard Williams View Post
    As I already stated, my preference is to use the p-values from the original coefficients. I think that is also what is more common.
    I would lean towards presenting the AMEs because they are easier to interpret, yet I am grappling with their insignificance.

    Code:
    mi est, post: svy: logit ro i.d_e i.pel d_pov i.p_bsc i.p_bnvq_s4 i.involv_s4 i.p_sy_s4 c.p_age_sample_s1##c.p_age_sample_s1 i.p_sex_s4
    mimrgns, dydx(i.pel) cmdmargins predict(pr)
    Code:
    Logit coefficients

    Multiple-imputation estimates       Imputations     =        30
    Survey: Logistic regression         Number of obs   =     4,460
    Number of strata =   3              Population size = 4,549.841
    Number of PSUs   = 200              Average RVI     =    0.3858
                                        Largest FMI     =    0.9258
                                        Complete DF     =       197
    DF adjustment: Small sample         DF: min         =      2.31
                                            avg         =    175.31
                                            max         =    195.02
    Model F test: Equal FMI             F(34, 157.1)    =     90.89
    Within VCE type: Linearized         Prob > F        =    0.0000

            ro       Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
         d_pel
        stage1    1.556983    1.072342   1.45    0.148     -.558007    3.671973
        stage2   -.5627057    .9420055  -0.60    0.551    -2.420539    1.295127
        stage3    2.444653    1.197903   2.04    0.043     .0821298    4.807177
        stage4    .3289372     1.40129   0.23    0.815    -2.434705    3.092579
        stage5   -.6252651    1.675261  -0.37    0.709    -3.929289    2.678759
    Code:
    Margins

                     dy/dx   Std. Err.      t    P>|t|     [95% Conf. Interval]
         d_pel
        stage1    .1470687    .1426974   1.03    0.304    -.1343723    .4285098
        stage2   -.0241331    .0347124  -0.70    0.488    -.0925933    .0443271
        stage3    .2947183     .218825   1.35    0.180    -.1368519    .7262885
        stage4    .0199796    .0946079   0.21    0.833    -.1666073    .2065665
        stage5   -.0261499    .0563228  -0.46    0.643    -.1372328     .084933
    Note: dy/dx for factor levels is the discrete change from the base level.



  • Richard Williams
    replied
    As I already stated, my preference is to use the p-values from the original coefficients. I think that is also what is more common. Many analyses do not even present adjusted predictions or marginal effects, but you just about always have the coefficients.

    But I am not the ultimate authority on such things, so you should decide what is best for you given what you want to test.



  • Elise Sobrie
    replied
    Dear all

    After reading your posts a couple of times, I understand what you are saying. However, it is not clear to me whether to follow the p-values of the margins or those of the coefficients in the original model. I know different hypotheses etc. are being tested, but is there one clear answer as to which to follow?

    Thank you in advance!



  • Chandra Shah
    replied
    Thanks for the latest comments.
    As you rightly point out, Richard, the results are in the same ballpark without multiple imputation and replicate weights. But I only know this after having estimated the models both ways! There were 10% missing values in aggregate across all variables. As Paul Allison noted, one should impute missing values whenever possible. Also, the use of replicate weights and plausible values is what the OECD recommends for the analysis of the PIAAC data that I am using.



  • Sebastian Kripfganz
    replied
    Originally posted by Stephen Jenkins View Post
    +1 to the post from Clyde. Very nicely put.
    Marginal effects are typically (but not always) non-linear functions of all the estimated parameters and explanatory variables. So even if particular coefficient or OR is "statistically significant", it doesn't guarantee that the marginal effect associated with that coefficient is "statistically significant". For that reason, you can get some of the features described by Clyde in #2.
    From a technical perspective, Stephen's answer is already sufficient. Coefficients and marginal effects are different quantities, and the latter are often non-linear combinations of the former. Typically, one would expect the p-values to be similar but there is no reason why they have to be close or even identical.

    With regard to the discussion of whether to look at the coefficients or the marginal effects, we should ask: why are we interested in the respective test result? The coefficients themselves in non-linear models typically do not have a useful interpretation when we want to examine the effect of a certain covariate on our outcome variable; when interpreting the results, we would typically look at the marginal effects. Yet testing for statistical significance of the coefficient estimates can be relevant when we think about the model specification, i.e. whether to include or exclude a certain covariate.

    A variable could be statistically relevant in the sense that its coefficient estimate is statistically significant while, at the same time, its marginal effect is statistically insignificant. The latter does not mean that the variable does not matter, because it also affects the marginal effects of all the other covariates. In other words, including it can still improve the fit of the model.
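
    The two null hypotheses can be written side by side in a sketch (built-in dataset; the point is only that they are different tests):

    Code:
    sysuse auto, clear
    probit foreign weight mpg
    * Wald test of H0: the coefficient on weight is zero
    test weight
    * delta-method test of H0: the average marginal effect of weight is zero
    margins, dydx(weight)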



  • Richard Williams
    replied
    That kind of complexity makes my head hurt! I might try skipping a few of the bells and whistles (e.g. do it without the imputation, or without the replicate weights) and see if that changes anything. I wouldn't expect it to, but a mistake somewhere along the way (either by you or by Stata) might affect the results.



  • Chandra Shah
    replied
    Thanks Stephen, I just found the Advanced editor!
    I have used the Stata capabilities you refer to for estimating my bivariate model with sample selection. Both the outcome and selection equations are important for the research questions I am investigating. Furthermore, I have replicate weights, plausible values and missing data to deal with. For each plausible value I have three multiply imputed data sets, making 30 imputed data sets in all. Incidentally, the two variables for which the p-values of the coefficient and the marginal effect are radically 'far' apart are both continuous.
    And thanks everybody for your contributions to this thread.



  • Stephen Jenkins
    replied
    Originally posted by Chandra Shah View Post
    Sorry I can't figure out how to do the quotes like in the above posts!
    Chandra: please read the FAQ from top to bottom. (Hit the black bar at the top of the page.) Your post #11 suggests you have mastered how to do the "quote" inserts, but also read about using CODE delimiters. Also read how to use the Advanced editor and its functionality (accessed by clicking on the underlined upper-case A in the editing box for composing messages). With that you can insert hyperlinked URLs.

    For Stata's capabilities for estimating marginal effects for a "bivariate probit with sample selection", see help heckprobit_postestimation. Also read the associated manual entry for information about methods and formulae. The "complexity" you refer to exists of course, but StataCorp have done a lot of work for you.
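
    For example, a minimal sketch of that workflow (variable names hypothetical; see the help file for the available predict() options):

    Code:
    * bivariate probit with sample selection: outcome y is observed
    * only when selected == 1; z appears only in the selection equation
    heckprobit y x1 x2, select(selected = x1 x2 z)
    * marginal effects on the default predicted probability
    margins, dydx(*)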



  • Chandra Shah
    replied
    Originally posted by Chandra Shah
    No problem.
    Quote from William Greene: "An empirical conundrum can arise when doing inference about partial effects rather than coefficients. For any particular variable, wk, the preceding theory does not guarantee that both the estimated coefficient, θk and the associated partial effect, δk will both be ‘statistically significant,’ or statistically insignificant. In the event of a conflict, one is left with the uncomfortable problem of simultaneously rejecting and not rejecting the hypothesis that a variable should appear in the model. Opinions differ on how to proceed. Arguably, the inference should be about θk, not δk, since in the latter case, one is testing a hypothesis about a function of all the coefficients, not just the one of interest."

    page 12:
    http://archive.nyu.edu/bitstream/2451/26036/2/7-7.pdf

    Also, see:
    Dowd, BE, Greene, WH & Norton, EC 2014, 'Computation of Standard Errors', Health Services Research, vol. 49, pp. 731-750.
    where the authors discuss the consistency of the estimator of the variance-covariance matrix and the complexity involved in calculating the standard error of a marginal effect in a multiple-equation model (e.g. a bivariate probit with sample selection). If full-information maximum likelihood is used, then the estimator is consistent.

    Sorry I can't figure out how to do the quotes like in the above posts!

