
  • Average marginal effects - extremely large results

    I'm running a two-stage residual inclusion (2SRI) model and evaluating average marginal effects after estimation, but the margins, dydx(*) command returns extremely large estimates. I'm hoping this is a simple user error that I'm struggling to spot as a relatively naive Stata user.

    Code:
    /*************************************************
    ** MODEL 3: GLM for the 2SRI first stage. **
    *************************************************/
    glm all_opioid_sum ib0.female ib1.race ib2.ethnicity_num ib0.insurance  ib0.dx_premature ib0.px_chd ib0.nonnecabd ib0.mednec ib0.surgnec ib0.dx_vlbw ib0.dx_elbw ib0.dx_hie ib0.ecmo move_avg_1year, family(gamma) link(log) vce(cluster hospital_number)
    /*************************************************
    ** Save the first stage residuals. **
    *************************************************/
    predict Xuhat3, response
    /*************************************************
    ** Apply GLM for the 2SRI second stage. **
    *************************************************/
    glm infla_total_suc all_opioid_sum ib0.female ib1.race ib2.ethnicity_num ib0.insurance  ib0.dx_premature ib0.px_chd ib0.nonnecabd ib0.mednec ib0.surgnec ib0.dx_vlbw ib0.dx_elbw ib0.dx_hie ib0.ecmo Xuhat3, family(gamma) link(log) vce(bootstrap, reps(50) cluster(hospital_number) bca)
    margins, dydx(*)

    [Screenshot: code2.PNG]

    [Screenshot: code3.PNG]



    Very similar models on the same data set that only add covariates yield very different, more reasonable results on a scale that makes sense. Any insight into why these marginal effects are so extremely large?

    For reference, essentially the same model with a few more clinical covariates of interest yields more reasonable results.
    [Screenshot: code1.PNG]


  • #2
    What's the scale of the dependent variable?



    • #3
      The dependent variable is continuous and reflects standardized costs associated with a patient encounter.

      HTML Code:
          Variable |        Obs        Mean    Std. Dev.        Min        Max
      -------------+----------------------------------------------------------
      infla_tota~c |    126,848    213233.1    253315.4    1201.961    9865631
      Last edited by Shadassa Ourshalimian; 30 Sep 2023, 22:34.



      • #4
        With a mean of 213,233 and a max of 9.9 million, I'm not sure the marginal effects are all that large.



        • #5
          The cost data are highly skewed with some extreme cases. However, the output feels strange given the scientific notation. Unless I'm mistaken, the output (2nd screenshot) for the variable dx_elbw (diagnosis of extremely low birthweight) shows an estimated effect on patient cost of 4.43x10^33, a number with 34 digits.

          The last image is the same model with a few additional clinical covariates of interest. With those added, the marginal effects seem more in line with expectations, for example an increase of 75,230 for a patient in the NICU/ICU.
          Last edited by Shadassa Ourshalimian; 01 Oct 2023, 11:34.



          • #6
            Sorry, I was looking at the last table. There's obviously something wrong in the second table. Stata can't compute the margins for some reason.

            You might start with the simplest model and keep adding variables to see when it blows up. That might shed some light.
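One way to organize this stepwise check is a loop that grows the covariate list one term at a time and reruns margins after each fit. This is only a sketch under several assumptions: Xuhat3 already exists from the first stage, only the effect of all_opioid_sum is reported, and the bootstrap is swapped for vce(cluster hospital_number) purely so each pass is fast. The local macro names (base, extras, rhs, term) are made up for illustration.

```stata
* Sketch: grow the second-stage covariate list one term at a time
* and rerun margins after every fit. Assumes Xuhat3 already exists.
local base all_opioid_sum Xuhat3
local extras ib0.female ib1.race ib2.ethnicity_num ib0.insurance ///
    ib0.dx_premature ib0.px_chd ib0.nonnecabd ib0.mednec ib0.surgnec ///
    ib0.dx_vlbw ib0.dx_elbw ib0.dx_hie ib0.ecmo

local rhs `base'
foreach term of local extras {
    local rhs `rhs' `term'
    display as text _newline "Added `term'"
    * Cluster-robust VCE instead of the bootstrap, for speed only
    quietly glm infla_total_suc `rhs', family(gamma) link(log) ///
        vce(cluster hospital_number)
    margins, dydx(all_opioid_sum)
}
```

If any covariate has missing values, the estimation sample can shift between steps, so it may be worth comparing e(N) across fits as well.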



            • #7
              Appreciate the suggestion. I ran a series of 18 models this evening, adding variables in a stepwise fashion. Everything runs fine until I add the variable nonnecabd; then the models start to generate these huge marginal estimates, and they return to "normal" once the ccc_count2 variable is added. The variables nonnecabd through nicu_icu are just dummy variables coded 0/1. It's very strange that the average marginal effects for the final, more complex model compute fine. Unfortunately, nothing jumps out at me as to why those particular variables would cause such a breakdown but then behave in the final model.



              • #8
                Perhaps some sort of singularity that is resolved by ccc_count2.

                Run the model on the subsamples of nonnecabd and see what happens.
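A rough sketch of that subsample check follows. It assumes the covariates entered in the order shown in the original command, so the model uses only the terms listed before nonnecabd; nonnecabd itself is dropped because it is constant within each subsample. The cross-tabulation against dx_elbw is one arbitrary choice for spotting sparse cells, and the cluster-robust VCE again stands in for the bootstrap to keep the run short.

```stata
* Sketch: check cell sizes, then refit the second stage within each
* level of nonnecabd (omitted from the covariate list, since it is
* constant inside each subsample).
tabulate nonnecabd dx_elbw, missing

forvalues g = 0/1 {
    display as text _newline "nonnecabd == `g'"
    capture noisily glm infla_total_suc all_opioid_sum ib0.female ///
        ib1.race ib2.ethnicity_num ib0.insurance ib0.dx_premature ///
        ib0.px_chd Xuhat3 if nonnecabd == `g', ///
        family(gamma) link(log) vce(cluster hospital_number)
    * Only ask for margins if the fit succeeded
    if _rc == 0 {
        margins, dydx(all_opioid_sum)
    }
}
```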



                • #9
                  Hmm, I'll give that a shot.

                  I was able to get reasonable marginal estimates using the atmeans option, which perhaps sheds some light on the situation. It's still strange to me that the beta estimates and SEs from the model seem "normal". If something odd were going on with perfect separation or a singularity, I would expect the model to fail to converge or the output for nonnecabd to be wonky.

                  margins, dydx(*) atmeans:

                  [Screenshot: Capture1.PNG]
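One possible reading of the atmeans result: with link(log) the fitted mean is exp(xb), so each observation's marginal effect scales with exp(xb). Plain margins, dydx(*) averages those effects over all observations, so a handful of very large linear predictors can dominate the average, whereas atmeans evaluates the effect once at the covariate means. The sketch below is a way to look for such observations; it is meant to run directly after the second-stage glm, and xb_hat and mu_hat are hypothetical variable names.

```stata
* Sketch: run directly after the second-stage glm.
predict double xb_hat if e(sample), xb    // linear predictor
predict double mu_hat if e(sample), mu    // fitted mean, exp(xb) under link(log)
summarize xb_hat mu_hat, detail

* Observations with the largest fitted means (note: gsort re-sorts the data)
gsort -mu_hat
list hospital_number all_opioid_sum Xuhat3 xb_hat mu_hat in 1/10
```

If the maximum of mu_hat is many orders of magnitude above the observed maximum cost of 9,865,631, those rows are the likely source of the huge averages.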
                  Last edited by Shadassa Ourshalimian; 03 Oct 2023, 21:27.



                  • #10
                    You must have some extreme outliers. I'd check whether they are driving the coefficient estimates.
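A minimal way to check this is to refit the second stage without the most extreme costs and compare coefficients side by side. The 99th-percentile cutoff, the stored-estimate names (full, trimmed), and the use of vce(cluster hospital_number) in place of the bootstrap are all assumptions made for illustration. Trimming on the dependent variable is shown here as a diagnostic only, not as a recommended final specification.

```stata
* Sketch: compare second-stage coefficients with and without the top 1%
* of costs. The 99th-percentile cutoff is an arbitrary illustration.
local rhs all_opioid_sum ib0.female ib1.race ib2.ethnicity_num ///
    ib0.insurance ib0.dx_premature ib0.px_chd ib0.nonnecabd ib0.mednec ///
    ib0.surgnec ib0.dx_vlbw ib0.dx_elbw ib0.dx_hie ib0.ecmo Xuhat3

summarize infla_total_suc Xuhat3, detail
_pctile infla_total_suc, percentiles(99)
local p99 = r(r1)

quietly glm infla_total_suc `rhs', family(gamma) link(log) ///
    vce(cluster hospital_number)
estimates store full

quietly glm infla_total_suc `rhs' if infla_total_suc <= `p99', ///
    family(gamma) link(log) vce(cluster hospital_number)
estimates store trimmed

* Side-by-side coefficients and standard errors
estimates table full trimmed, b(%9.4f) se(%9.4f)
```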
