
  • Propensity score to define a subpopulation according to the characteristics of individuals who experience success

    Dear Stata users,

    I have a question that has been troubling me. Here is an example related to health: suppose one has a balanced panel of a population with information on health status (a dummy variable equal to 1 if the individual is sick and 0 otherwise) and several other variables such as employment, age, city of residence, whether the individual has health insurance, and so on.
    Not everyone in the population has the same risk of being sick, and my interest here is to define a subgroup of the population that can be considered at risk of success (i.e. of developing one specific illness).
    I am wondering what strategy can be used to define this subpopulation from the information available about the individuals who experience success.
    My database is a two-wave balanced panel of 2000 individuals, of whom 33 experience "success" (dummy = 1 for health status).

    The only strategy I have thought of so far is:
    - Estimate a propensity score from the available information, then use the scores of the sick individuals to define a range within which individuals can be considered at risk.
    With Stata code such as:
    pscore seek age origin employment_status city_residence insurance smoker ..., pscore(pscore) blockid(block) detail logit level(0.1) numblo(5)

    The range is then defined by two extreme values: the lowest and the highest propensity score among the individuals who experience success.
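
    A minimal sketch of how that range could be computed, assuming a plain pooled logit in place of -pscore- and using only the covariates named in the call above (the elided ones would be added to the list). The names phat, p_lo, p_hi and at_risk are hypothetical and chosen for illustration; seek is the health-status dummy as spelled in the command.

    ```stata
    * Sketch only: pooled logit on the listed covariates (assumption: they
    * enter as in the pscore call; categorical ones may need i. prefixes)
    logit seek age origin employment_status city_residence insurance smoker

    * Predicted probability of being sick for the estimation sample
    predict double phat if e(sample), pr

    * Lowest and highest predicted score among those who experience success
    summarize phat if seek == 1, meanonly
    scalar p_lo = r(min)
    scalar p_hi = r(max)

    * Flag everyone whose score falls inside that range
    generate byte at_risk = inrange(phat, p_lo, p_hi) if !missing(phat)
    tabulate at_risk seek
    ```

    With only 33 successes, both bounds rest on single observations, so the final tabulation is worth checking to see how much of the sample the range actually excludes.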

    I haven't found much information about the validity of this approach, so I am asking for your advice on this specific issue.

    Best regards,

  • #2
    Marcel:
    if you state that those included in your sample do not have the same risk of falling sick, you are implicitly saying that you have information about their risk of becoming sick.
    If that is the case, you can plug in a predictor, which sounds to me like a categorical variable with, say, three levels of risk of becoming sick: high, moderate, low.
    You may want to consider -xtlogit- if you're going to carry out a panel data regression; otherwise, I guess you may also think of survival analysis.
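
    A minimal sketch of the -xtlogit- route, assuming hypothetical identifiers id (individual) and wave (survey wave) and the covariates named in the original post; the random-effects specification is an illustrative choice, not something the thread settles.

    ```stata
    * Sketch only: id and wave are assumed names for the panel and time variables
    xtset id wave

    * Random-effects panel logit for the health-status dummy
    xtlogit seek age origin employment_status city_residence insurance smoker, re

    * Predicted probability with the individual random effect set to zero
    predict double p_re if e(sample), pu0
    ```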
    Unfortunately (for me, at least), I'm not familiar with propensity score stuff.
    As a general remark, the fact that only 33/2000 (1.65%) patients become sick makes it hard to hope for interesting results.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Carlo, thank you for this post.
      I see your point about the small proportion of successes in the sample. I will follow your advice regarding xtlogit, though.

      Best regards,
