  • Selection bias

    Hi everyone,

    I'm a beginner in econometrics, and I'm working on a study examining the association between CO₂ exposure and infant mortality using a pooled cross-sectional dataset. Please let me know if my question is flawed.
    In the first study,
    I run a regression using reghdfe with fixed effects (assuming CO₂ exposure is exogenous) to estimate infant mortality. This stage includes several controls and fixed effects (e.g., country or year) to account for unobserved heterogeneity. The results are statistically significant.
    In the second study,
    I then regress the weight-for-age of children on CO₂ exposure. Here, I assume that the unobservables (error term) from the infant mortality regression drive the selection of children (i.e., only survivors are observed in the second stage).
    As I understand it, because the second study conditions on survival, its sample is selected in a non-random manner.
    I attempted to address this using a Heckman selection model, but I'm finding it extremely difficult to find a valid instrument that affects survival (the selection process) without directly affecting weight-for-age.
    Are there alternative methods or strategies you can recommend to address or rationalize selection bias in this context, especially when a valid instrument for the Heckman model is hard to come by?
    Is there any way to mathematically model the bias and show how much my estimates could shift because of it?
    I’d appreciate any insights, alternative suggestions, or relevant literature that could help me move forward.
    Thanks in advance for your help!

  • #2
    Manski Bounds, maybe.
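    To give a rough idea of what worst-case (Manski-style) bounds involve, here is a minimal Python sketch. It is not from this thread and is not a full treatment of the method: it assumes the outcome is bounded (the [-6, 5] range for weight-for-age z-scores is an assumption for illustration), that outcomes are simply unobserved for non-survivors, and that all numbers and function names are hypothetical. The width of the resulting interval indicates how far an estimate could move under arbitrary selection.

```python
from statistics import fmean


def manski_mean_bounds(observed_outcomes, n_total, y_min, y_max):
    """Worst-case bounds on E[Y] when Y is observed only for selected units.

    observed_outcomes: outcomes for the selected (e.g. surviving) units
    n_total: total number of units, selected or not
    y_min, y_max: assumed logical bounds on the outcome
    """
    n_obs = len(observed_outcomes)
    if n_obs == 0 or n_obs > n_total:
        raise ValueError("need 0 < len(observed_outcomes) <= n_total")
    p_selected = n_obs / n_total
    mean_observed = fmean(observed_outcomes)
    # Fill in every unobserved outcome with the lowest / highest possible value.
    lower = mean_observed * p_selected + y_min * (1 - p_selected)
    upper = mean_observed * p_selected + y_max * (1 - p_selected)
    return lower, upper


def manski_difference_bounds(bounds_a, bounds_b):
    """Bounds on E[Y_a] - E[Y_b] given separate bounds for each group."""
    return bounds_a[0] - bounds_b[1], bounds_a[1] - bounds_b[0]


# Hypothetical data: 3 of 4 children observed in the high-exposure group,
# 2 of 2 observed in the low-exposure group.
high = manski_mean_bounds([-1.0, -2.0, 0.0], n_total=4, y_min=-6, y_max=5)
low = manski_mean_bounds([0.0, 0.0], n_total=2, y_min=-6, y_max=5)
diff = manski_difference_bounds(high, low)
print(high, low, diff)
```

    With no assumptions about who is missing, the interval is often wide; narrowing it requires extra assumptions such as monotone selection.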



    • #3
      Prof. Ford, thank you so much for suggesting Manski bounds.
      I will look into this further.
      Any more insights are also appreciated.



      • #4
        You might look to regional variation in hospitals per square mile or NICU beds per capita as IVs. These likely affect early outcomes but not weight-for-age over time.

        But see https://pmc.ncbi.nlm.nih.gov/articles/PMC10867701/

        Drug use/opioid deaths, maybe.

        SUID seems to vary a bit. https://www.cdc.gov/sudden-infant-de...-by-state.html
        Last edited by George Ford; 23 Mar 2025, 10:09.



        • #5
          Thank you, Prof. Ford.
          Your suggestion is incredibly helpful!
          I truly appreciate the idea of using regional variation in hospitals per square mile and NICU beds per capita as IVs, along with the excellent references. This work is based in Africa, where data availability can be a bit limited, so I'll need to look for the data. Thank you again!
          Last edited by Upeksha Diss; 23 Mar 2025, 18:36.
