  • Value of the Prob > chi2

    Dear Statalist,

    I hope you are well. I have run the following probit regression in Stata:

    probit Nd_FUND i.FST_EXP_INTS i.FST_GWT i.FST_BP i.FST_AUD i.FST_ADV i.RE_LE_MORE6 i.RE_ST2 i.PO_GEN i.PO_CIT i.PO_EX i.PO_ED2 i.F_SEC i.F_AGE i.F_SIZE

    The above-highlighted variables are the only significant variables in the model.

    Probit regression                               Number of obs     =        110
                                                    LR chi2(26)       =      42.24
                                                    Prob > chi2       =     0.0232
    Log likelihood = -46.074444                     Pseudo R2         =     0.3143


    I would like to ask: is the value Prob > chi2 = 0.0232 acceptable for reporting that 'overall the model is significant'? And can I accept the model on the basis of the results of the goodness-of-fit test?

    For clarification, I obtained the following goodness-of-fit results for the model:

    constant = 0.632

    Pearson chi2(80) = 86.57
    Prob > chi2 = 0.2883


    Hosmer-Lemeshow test

    +--------------------------------------------------------+
    | Group |   Prob | Obs_1 | Exp_1 | Obs_0 | Exp_0 | Total |
    |-------+--------+-------+-------+-------+-------+-------|
    |     1 | 0.0203 |     0 |   0.1 |    11 |  10.9 |    11 |
    |     2 | 0.0421 |     0 |   0.4 |    11 |  10.6 |    11 |
    |     3 | 0.0912 |     0 |   0.8 |    11 |  10.2 |    11 |
    |     4 | 0.1553 |     1 |   1.4 |    10 |   9.6 |    11 |
    |     5 | 0.2182 |     4 |   2.3 |     8 |   9.7 |    12 |
    |-------+--------+-------+-------+-------+-------+-------|
    |     6 | 0.2902 |     4 |   2.5 |     6 |   7.5 |    10 |
    |     7 | 0.3756 |     5 |   3.7 |     6 |   7.3 |    11 |
    |     8 | 0.5251 |     2 |   4.9 |     9 |   6.1 |    11 |
    |     9 | 0.7812 |     8 |   6.8 |     3 |   4.2 |    11 |
    |    10 | 0.9937 |     9 |   9.8 |     2 |   1.2 |    11 |
    +--------------------------------------------------------+

    number of groups = 10
    Hosmer-Lemeshow chi2(8) = 9.16
    Prob > chi2 = 0.3287


    Many thanks for your support

    Kind regards,
    Rabab

  • #2
    The Prob > chi2 = 0.0232 from the -probit- command is a significance test of the joint null hypothesis that all model coefficients are zero. This is sometimes, by abuse of language, abbreviated as "the model is significant," whatever that is supposed to mean. It is rare that the null hypothesis of all model coefficients being zero is of any interest to anybody. Usually we are interested in the effects of some of the variables in the model, whereas others are included to adjust for their nuisance contributions to the outcome variable. This chi square statistic, however, has nothing at all to do with model fit.
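
To make the distinction concrete, here is a minimal sketch, assuming the model from #1 is fit as posted and run from a do-file (the `///` line continuations do not work interactively). The choice of FST_GWT as the variable of interest is purely illustrative and is not taken from the original post.

```stata
* Fit the model from #1 (variable names as given there)
probit Nd_FUND i.FST_EXP_INTS i.FST_GWT i.FST_BP i.FST_AUD i.FST_ADV ///
    i.RE_LE_MORE6 i.RE_ST2 i.PO_GEN i.PO_CIT i.PO_EX i.PO_ED2 ///
    i.F_SEC i.F_AGE i.F_SIZE

* The header p-value is the upper tail of a chi-squared distribution
* with e(df_m) degrees of freedom, evaluated at the LR statistic e(chi2)
display chi2tail(e(df_m), e(chi2))

* Same calculation from the numbers reported in #1 (reported there as 0.0232)
display chi2tail(26, 42.24)

* Joint Wald test of the coefficients of a single variable of interest
* (FST_GWT is an arbitrary example, not a recommendation)
testparm i.FST_GWT
```

The first two lines only reproduce the joint test that all 26 coefficients are zero. The `testparm` line is usually closer to a substantive research question, since it targets one variable's effect while the rest of the model adjusts for the others.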

    The Prob > chi2 = 0.3287 that you see in the output of -estat, gof- is the Hosmer-Lemeshow chi square and it is a goodness of fit test. The idea behind it is that in a model that fits the data well, this p-value will be high. Personally, I don't think that the p-values of the Hosmer-Lemeshow test should be used in this way because in a large sample (not a worry for your small N of 110) the test can easily come out to be "statistically significant" even though the departure of the model from the data is small enough to ignore for practical purposes. (And the opposite can also happen: in a very small data set, you can fail to get a "statistically significant" result from the Hosmer-Lemeshow test even for a model that fits the data terribly, just because the n's in each decile are too small.) I think you are better advised to look at the table that got printed out and compare the Obs_1 and Exp_1 columns to see if they are close enough to each other to suit your purposes. It is also a good idea to see if the agreement between Obs and Exp shows a pattern. That is, sometimes the overall chi square suggests the model fit is acceptable, but closer inspection reveals that there is good agreement between Obs and Exp in the middle deciles, but it doesn't work well at the extremes, or vice versa. Patterns like that often suggest that your model lacks a key variable, or perhaps needs the inclusion of interactions or non-linear specifications of some variable(s).
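
One possible way to do that comparison in Stata is sketched below, assuming the probit model from #1 is still the active estimation result and that Nd_FUND is coded 0/1. The names phat, decile, obs1, exp1, n and gap are hypothetical, introduced only for this illustration. Groups formed with -xtile- may differ slightly from those formed by -estat gof- when predicted probabilities are tied.

```stata
* Hosmer-Lemeshow table with 10 groups, as shown in #1
estat gof, group(10) table

* The reported p-value is the upper tail of chi2(8) at the HL statistic
* (degrees of freedom = number of groups - 2)
display chi2tail(8, 9.16)

* Rebuild the observed vs. expected comparison by hand to look for patterns
predict double phat if e(sample), pr     // predicted probability of Nd_FUND = 1
preserve
keep if e(sample)
xtile decile = phat, nq(10)              // ten groups by predicted probability
collapse (sum) obs1 = Nd_FUND exp1 = phat (count) n = phat, by(decile)
generate gap = obs1 - exp1               // positive gap: model under-predicts
list decile n obs1 exp1 gap, noobs
restore
```

Looking down the gap column shows whether disagreement is concentrated in particular groups. In the table in #1, for instance, group 8 has 2 observed against 4.9 expected, which is the largest single discrepancy.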



    • #3
      Originally posted by Clyde Schechter


      Dear Clyde,

      Thank you very much for this clarification


      Kind regards,
      Rabab
