
  • #16
    Hello,
    The above discussion helped me a lot. Thanks. But I am wondering whether we need to use a zero-truncated Poisson model when we analyze the dollhill3 data. Since no observation has a deaths value of 0, I think a zero-truncated model would be better. Do I understand this correctly?


    • #17
      Clyde and Colin have done a great service by working through this. I have a similar problem but a slightly different goal, and I was hoping somebody could help identify the precise solution.

      I have data for 2200 counties and have modeled this: nbreg fraud_loans X1 X2, exposure(total_loans) irr
      where "fraud_loans" is a count of how many mortgage loans in each county contain fraud and "total_loans" is the total number of loans in each county at risk for mortgage fraud (with some having many more loans at risk than others, or in other words variable exposure). From the model, I would like to obtain the predicted proportion of loans (fraud_loans/total_loans) at different levels of X1 (a continuous variable). The post-estimation commands I have run, however, appear to output predicted counts (at mean exposure) rather than predicted proportion (or rate). For example:

      margins, at(X1=(10(10)100)) gives the predicted counts of fraud_loans at different levels of X1. That's useful, but does anybody have advice on how I can modify the command to get the predicted proportion of loans that are fraudulent (fraud_loans/total_loans) at different levels of X1? Would margins, at(X1=(10(10)100)) expression(predict(ir)) do the trick?

      Thanks, Eric
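
      A possible way to frame the question above, offered as a sketch rather than a tested solution: after nbreg with exposure(), the default prediction used by margins is predict(n), the expected count with each county's own exposure included. The predict option ir instead returns exp(xb), the expected count per unit of exposure, which here would be fraud loans per loan at risk. The variable names are the ones from the post; nothing below has been run against the actual data.

      ```stata
      * Model from the post: count outcome with varying exposure
      nbreg fraud_loans X1 X2, exposure(total_loans) irr

      * Default prediction: expected counts, exposure included
      margins, at(X1=(10(10)100))

      * Predicted rate: exp(xb) without the exposure term, averaged over the sample
      margins, at(X1=(10(10)100)) predict(ir)

      * The same request written with expression(), as proposed in the post
      margins, at(X1=(10(10)100)) expression(predict(ir))
      ```

      The last two commands should report the same margins. Note that the result is a model-based rate per loan at risk, so nothing in the model constrains it to stay below 1.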


      • #18
        Hi Colin and Clyde:
        Thanks for the discussion; it has been very helpful as I try to get predicted incidence rates with CIs. I have followed every step the experts suggested above and understand the difference between "margins smokes, predict(ir)" and "margins, over(smokes) exp(predict(n)/py_bysmokes)". The latter, with its CI, is what I need and is very helpful indeed. Thanks so much!
        Now I have a question about the predicted number:
        I noticed that margins smokes, predict(n) produces results different from what the commands below produce (I am not sure if I am right):
        gen predict_smokes=exp(_b[1.smokes]+_b[agecat]*agecat+_b[_cons])*pyear
        gen predict_nsmokes=exp(_b[agecat]*agecat+_b[_cons])*pyear
        margins smokes, predict(n) gives us the result

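        ------------------------------------------------------------------------------
                     |            Delta-method
                     |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
              smokes |
                  0  |   52.06201   5.180823    10.05   0.000     41.90778    62.21623
                  1  |   78.16372   3.114217    25.10   0.000     72.05996    84.26747
        ------------------------------------------------------------------------------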

        I assume 52.06 and 78.16 are the average predicted numbers.
        However, the "gen" commands above followed by egen give:
        egen mean_smokes=mean(predict_smokes),by(smokes)
        egen mean_nsmokes=mean(predict_nsmokes),by(smokes)

        table smoke,c(sum deaths sum pyears mean mean_smokes mean mean_nsmokes)

        --------------------------------------------------------------------------
           smokes | sum(deaths)  sum(pyears)  mean(mean_s~s)  mean(mean_n~s)
        ----------+---------------------------------------------------------------
                0 |         101        39220        30.32743            20.2
                1 |         630       142247             126        83.92401
        --------------------------------------------------------------------------

        I can't find any correspondence between the values obtained from margins and those obtained from gen.
        Could anyone explain this to me?

        Your help is much appreciated!
        Cheers
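
        One check that can be made with the numbers already printed above, hedged because the estimation command itself is not shown in this post: margins smokes, predict(n) sets smokes to 0 (and then to 1) for every observation and averages the predicted count over the whole sample, whereas egen ..., by(smokes) averages within each smoking group only. If the two groups have the same number of rows (dollhill3 has five age categories in each), the margin should equal the simple average of the two by-group means of the same variable.

        ```stata
        * Arithmetic check using only values printed in the post.
        * predict_nsmokes by-group means: compare with the margin for smokes = 0
        display (20.2 + 83.92401)/2
        * predict_smokes by-group means: compare with the margin for smokes = 1
        display (30.32743 + 126)/2

        * On the data itself: average over ALL rows instead of by(smokes).
        * predict_nsmokes and predict_smokes are the variables created in the post.
        summarize predict_nsmokes predict_smokes
        ```

        The two display results come to about 52.06 and 78.16, which agree with the margins output. The by-group values 20.2 and 126 are also the observed deaths per row in each group (101/5 and 630/5).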
