
  • Relative importance of predictors with fractional outcomes

    Dear all,
    I have a dataset with a set of fractional outcomes ranging from 0 to 1, summing up to 1 for each individual. I would like to do the following:
    1. Estimate the effect of the predictors on each fractional outcome (in absolute terms, not relative to one specific reference outcome)
    2. Compare the relative importance of predictors for each fractional outcome
    3. Compare the relative importance of predictors across fractional outcomes

    Suppose that in my example, the fractional outcomes are governing, safety, education, recreation, social and urbanplanning; the predictors are popdens and houseval. I decided to fit a fractional multinomial logistic regression model as below:

    Code:
    use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear
    
    local outcomes governing safety education recreation social urbanplanning
    
    fmlogit `outcomes', eta(houseval popdens)
    
    foreach v of local outcomes {
        margins, dydx(houseval popdens) predict(outcome(`v'))
    }
    This approach allows me to achieve aim #1, but I would kindly request help on:
    - how to establish the relative importance of houseval and popdens on, e.g., governing
    - how to establish the relative importance of houseval on governing and houseval on safety
    - how to establish the relative importance of houseval on governing and popdens on safety
    - how to quantify such relative importance (e.g. to be able to claim that houseval has double the effect on governing that popdens has, or that the effect of popdens on governing is 1.5 times that on recreation)

    I guess a SUEST/SUREG and/or a DOMIN/DOMME approach would be helpful (please note that in my dataset houseval and popdens have different units of measurement), but I'm not sure how to implement them in practice. Perhaps Joseph Luchman or others could help?
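
    Regarding the different units of measurement: one common way to put houseval and popdens on a comparable footing is to z-score them before fitting, so that each marginal effect is expressed per standard deviation of the predictor. A minimal sketch, assuming fmlogit is installed (e.g. via ssc install fmlogit); the z_ variable names are illustrative and not part of the dataset:

    ```stata
    use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear

    local outcomes governing safety education recreation social urbanplanning

    * z-score both predictors so that a one-unit change is one standard deviation
    foreach x in houseval popdens {
        egen double z_`x' = std(`x')
    }

    fmlogit `outcomes', eta(z_houseval z_popdens)

    * average marginal effects on each fraction, now per SD of each predictor
    foreach v of local outcomes {
        margins, dydx(z_houseval z_popdens) predict(outcome(`v'))
    }
    ```

    This only rescales the predictors; it does not by itself provide a formal importance ranking.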

    Thanks,
    Manuel
    Last edited by Manuel Ferraro; 28 Aug 2022, 08:39.

  • #2
    deleted because not relevant



    • #3
      Manuel,

      A dominance approach could be applied if 1) there is a model fit metric you trust for evaluating the model and 2) the structure of the command meets the requirements of -domin- (a depvar indepvars structure) or -domme- (the command accepts constraints). Usually, if the command does not work as is, you can write a program that restructures the inputs to get it to work (especially with -domin-).

      A couple of questions to think about relative to the needs stated here:

      2. Compare the relative importance of predictors for each fractional outcome
      Are you thinking of collapsing across outcomes for this relative importance determination? For instance, looking for the importance of 'governing' across all outcomes relative to other predictive factors across all outcomes.

      3. Compare the relative importance of predictors across fractional outcomes
      Are rank-order comparisons of predictive factors across dependent variables in different models sufficient, or do you need a decomposition of a fit metric across all estimated parameters in the model?
      Joseph Nicholas Luchman, Ph.D., PStat® (American Statistical Association)
      ----
      Research Fellow
      Fors Marsh

      ----
      Version 18.0 MP



      • #4
        Hi Joseph, thank you for your reply.

        Are you thinking of collapsing across outcomes for this relative importance determination? For instance, looking for the importance of 'governing' across all outcomes relative to other predictive factors across all outcomes.
        Not really. Actually, I would be primarily interested in ranking predictors for each individual outcome. I had thought about something like this:

        Code:
        use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear
        
        local outcomes governing safety education recreation social urbanplanning
        
        foreach v of local outcomes {
            domin `v' houseval popdens, reg(fracreg logit) fitstat(e(r2_p))
        }
        However, I think this approach has two problems: 1) it ignores that the fractional outcomes are related to each other, and 2) it still would not let me rank predictors across outcomes.

        Are rank-order comparisons of predictive factors across dependent variables in different models sufficient, or do you need a decomposition of a fit metric across all estimated parameters in the model?
        I think the latter would be preferable (e.g. providing dominance statistics which in turn might allow me to compare and quantify the magnitude of importance across predictive factors and dependent variables)



        • #5
          Given your response, I think a -domme-based approach like below should work:

          Code:
          . domme (eta_safety = houseval popdens) (eta_recreation = houseval popdens)
          > (eta_education = houseval popdens) (eta_social = houseval popdens)
          > (eta_urbanplanning = houseval popdens),
          > reg(fmlogit governing safety education recreation social urbanplanning)
          > ropt(eta(minorityleft noleft houseval popdens)) fitstat(e(), mcf)
          
          Total of 1023 models/regressions
          
          Progress in running all regression subsets
          0%------50%------100%
          ....................
          
          Computing conditional dominance
          
          Computing complete dominance
          
          General dominance statistics: ML fit of fractional multinomial logit
          Number of obs             =                     392
          Overall Fit Statistic     =                  0.0036
          
                      |      Dominance      Standardized      Ranking
                      |      Stat.          Domin. Stat.
          ------------+------------------------------------------------------------------------
          eta_safety  |
           houseval   |         0.0002      0.0529            6 
           popdens    |         0.0003      0.0850            4 
          eta_recre~n |
           houseval   |         0.0001      0.0184            8 
           popdens    |         0.0001      0.0172            9 
          eta_educa~n |
           houseval   |         0.0007      0.1809            3 
           popdens    |         0.0000      0.0138            10
          eta_social  |
           houseval   |         0.0012      0.3265            1 
           popdens    |         0.0007      0.1909            2 
          eta_urban~g |
           houseval   |         0.0002      0.0425            7 
           popdens    |         0.0003      0.0718            5 
          -------------------------------------------------------------------------------------
          Conditional dominance statistics
          -------------------------------------------------------------------------------------
          
                                       #param_ests:  #param_ests:  #param_ests:  #param_ests:  #param_ests:  #param_ests:  #param_ests:  #param_ests:  #param_ests:  #param_ests:
                                                 1             2             3             4             5             6             7             8             9            10
                 eta_safety:houseval        0.0003        0.0003        0.0003        0.0002        0.0002        0.0002        0.0001        0.0001        0.0001        0.0001
                  eta_safety:popdens        0.0005        0.0005        0.0005        0.0004        0.0004        0.0003        0.0002        0.0002        0.0001        0.0000
             eta_recreation:houseval        0.0000        0.0000        0.0000        0.0000        0.0001        0.0001        0.0001        0.0001        0.0001        0.0001
              eta_recreation:popdens        0.0001        0.0001        0.0001        0.0001        0.0001        0.0001        0.0001        0.0001        0.0001        0.0001
              eta_education:houseval        0.0005        0.0005        0.0005        0.0006        0.0006        0.0006        0.0007        0.0007        0.0008        0.0009
               eta_education:popdens        0.0000        0.0000        0.0000        0.0000        0.0000        0.0001        0.0001        0.0001        0.0001        0.0001
                 eta_social:houseval        0.0013        0.0012        0.0012        0.0012        0.0012        0.0011        0.0011        0.0011        0.0011        0.0011
                  eta_social:popdens        0.0008        0.0007        0.0007        0.0007        0.0007        0.0007        0.0007        0.0007        0.0006        0.0006
          eta_urbanplanning:houseval        0.0002        0.0002        0.0002        0.0002        0.0002        0.0002        0.0001        0.0001        0.0001        0.0001
           eta_urbanplanning:popdens        0.0001        0.0002        0.0002        0.0002        0.0002        0.0003        0.0003        0.0003        0.0004        0.0004
          -------------------------------------------------------------------------------------
          Complete dominance designation
          -------------------------------------------------------------------------------------
          
                                      dominated?:  dominated?:  dominated?:  dominated?:  dominated?:  dominated?:  dominated?:  dominated?:  dominated?:  dominated?:
                                       eta_safety   eta_safety  eta_recre~n  eta_recre~n  eta_educa~n  eta_educa~n   eta_social   eta_social  eta_urban~g  eta_urban~g
                         dominates?:     houseval      popdens     houseval      popdens     houseval      popdens     houseval      popdens     houseval      popdens
                 eta_safety:houseval            0            0            0            0            0            0           -1           -1            0            0
                  eta_safety:popdens            0            0            0            0            0            0           -1           -1            0            0
             eta_recreation:houseval            0            0            0            0           -1            0           -1           -1            0            0
              eta_recreation:popdens            0            0            0            0           -1            0           -1           -1            0            0
              eta_education:houseval            0            0            1            1            0            1           -1            0            1            0
               eta_education:popdens            0            0            0            0           -1            0           -1           -1            0            0
                 eta_social:houseval            1            1            1            1            1            1            0            0            1            1
                  eta_social:popdens            1            1            1            1            0            1            0            0            1            1
          eta_urbanplanning:houseval            0            0            0            0           -1            0           -1           -1            0            0
           eta_urbanplanning:popdens            0            0            0            0            0            0           -1           -1            0            0
          -------------------------------------------------------------------------------------
          
          Strongest dominance designations
          
          eta_social:houseval completely dominates eta_safety:houseval
          eta_social:popdens completely dominates eta_safety:houseval
          eta_social:houseval completely dominates eta_safety:popdens
          eta_social:popdens completely dominates eta_safety:popdens
          eta_education:houseval completely dominates eta_recreation:houseval
          eta_social:houseval completely dominates eta_recreation:houseval
          eta_social:popdens completely dominates eta_recreation:houseval
          eta_education:houseval completely dominates eta_recreation:popdens
          eta_social:houseval completely dominates eta_recreation:popdens
          eta_social:popdens completely dominates eta_recreation:popdens
          eta_social:houseval completely dominates eta_education:houseval
          eta_education:houseval completely dominates eta_education:popdens
          eta_social:houseval completely dominates eta_education:popdens
          eta_social:popdens completely dominates eta_education:popdens
          eta_education:houseval completely dominates eta_urbanplanning:houseval
          eta_social:houseval completely dominates eta_urbanplanning:houseval
          eta_social:popdens completely dominates eta_urbanplanning:houseval
          eta_social:houseval completely dominates eta_urbanplanning:popdens
          eta_social:popdens completely dominates eta_urbanplanning:popdens
          eta_education:houseval conditionally dominates eta_safety:houseval
          eta_urbanplanning:popdens conditionally dominates eta_recreation:houseval
          eta_safety:houseval conditionally dominates eta_recreation:popdens
          eta_urbanplanning:houseval conditionally dominates eta_recreation:popdens
          eta_urbanplanning:popdens conditionally dominates eta_recreation:popdens
          eta_recreation:houseval conditionally dominates eta_education:popdens
          eta_urbanplanning:houseval conditionally dominates eta_education:popdens
          eta_urbanplanning:popdens conditionally dominates eta_education:popdens
          eta_social:houseval conditionally dominates eta_social:popdens
          eta_education:houseval conditionally dominates eta_urbanplanning:popdens
          eta_safety:popdens generally dominates eta_safety:houseval
          eta_urbanplanning:popdens generally dominates eta_safety:houseval
          eta_education:houseval generally dominates eta_safety:popdens
          eta_safety:houseval generally dominates eta_recreation:houseval
          eta_safety:popdens generally dominates eta_recreation:houseval
          eta_urbanplanning:houseval generally dominates eta_recreation:houseval
          eta_safety:popdens generally dominates eta_recreation:popdens
          eta_recreation:houseval generally dominates eta_recreation:popdens
          eta_social:popdens generally dominates eta_education:houseval
          eta_safety:houseval generally dominates eta_education:popdens
          eta_safety:popdens generally dominates eta_education:popdens
          eta_recreation:popdens generally dominates eta_education:popdens
          eta_safety:houseval generally dominates eta_urbanplanning:houseval
          eta_safety:popdens generally dominates eta_urbanplanning:houseval
          eta_urbanplanning:popdens generally dominates eta_urbanplanning:houseval
          eta_safety:popdens generally dominates eta_urbanplanning:popdens
          With this result, you have a full decomposition of the McFadden R^2 produced by -domme- and can use these values to compare within and across equations.
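
          As an illustration of how these values could be used to quantify relative importance: the standardized general dominance statistics sum to 1 (up to rounding), so the ratio of two of them compares their shares of the overall fit statistic. A minimal sketch using values copied by hand from the table above; the scalar names are made up for this example:

          ```stata
          * standardized general dominance statistics, typed in from the table above
          scalar gd_social_houseval = 0.3265
          scalar gd_social_popdens  = 0.1909
          scalar gd_educ_houseval   = 0.1809

          * within one equation: houseval vs. popdens for eta_social (about 1.7)
          display "social: houseval / popdens        = " %5.2f gd_social_houseval/gd_social_popdens

          * across equations: houseval in eta_social vs. eta_education (about 1.8)
          display "houseval: social / education      = " %5.2f gd_social_houseval/gd_educ_houseval
          ```

          These ratios describe contributions to the McFadden R^2 relative to the baseline category, not ratios of marginal effects. Because the printed values are rounded to four decimals, ratios are better computed from the unrounded results stored by -domme- (see its help file for what is returned).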
          Joseph Nicholas Luchman, Ph.D., PStat® (American Statistical Association)
          ----
          Research Fellow
          Fors Marsh

          ----
          Version 18.0 MP



          • #6
            Thank you so much Joseph, this is very helpful. I have just one question (which was also my original issue with fmlogit): how can I obtain the estimates for governing? I'm not interested in one dependent variable relative to another, but in all dependent variables individually. In a typical fmlogit analysis, I would follow up with dfmlogit (or margins, dydx(*) predict(outcome(governing)) as in my first example), but is it possible to use domme with the results of dfmlogit?

            Thanks again,
            Manuel



            • #7
              Because -fmlogit- is an extension of the -mlogit- command to fractional outcomes, one category's fraction must serve as the comparison, just as one category in -mlogit- must serve as the baseline. Thus it is not possible to obtain parameter estimates for all outcome fractions. For this analysis (as with the standard -mlogit- example in the Stata Journal example provided for -domme-), you will have to choose the baseline carefully, as the fit attributed to the parameters will be relative to it.

              That's not the case with margins as they're based on changes in predicted values and you can get predicted values for all fractions.

              It should be possible to separate the predicted values by outcome and evaluate their accuracy on an outcome-by-outcome basis using Shapley value decomposition/dominance, but this is not likely to work with -domin- or -domme- directly as currently implemented, because both assume a scalar returned value (not a matrix or vector). Also, it is not clear (to me) that it would tell a meaningfully different story (as the omitted category's predicted value is 1 minus the sum of the other categories' predicted values).
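
              A minimal sketch of the outcome-by-outcome idea for the two-predictor case, where the Shapley decomposition can be written out by hand. It assumes that predict after fmlogit accepts outcome() (as the margins call in #1 implies), uses the squared correlation between observed and predicted fractions as a stand-in fit metric, and sets the fit of the predictor-free model to 0 by definition; the variable and scalar names are illustrative:

              ```stata
              use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear

              local outcomes governing safety education recreation social urbanplanning
              local target governing

              * fit each non-empty predictor subset; score only the target outcome
              local i = 0
              foreach rhs in "houseval" "popdens" "houseval popdens" {
                  local ++i
                  quietly fmlogit `outcomes', eta(`rhs')
                  quietly predict double phat`i', outcome(`target')
                  quietly correlate `target' phat`i'
                  scalar r2_`i' = r(rho)^2
              }

              * Shapley values: average marginal gain over both orderings
              * (r2_1 = houseval only, r2_2 = popdens only, r2_3 = both, empty model = 0)
              scalar sh_houseval = 0.5*(r2_1 - 0) + 0.5*(r2_3 - r2_2)
              scalar sh_popdens  = 0.5*(r2_2 - 0) + 0.5*(r2_3 - r2_1)

              display "`target': houseval = " %6.4f sh_houseval "   popdens = " %6.4f sh_popdens
              ```

              The two values add up to the fit of the full model for the chosen fraction. Changing the local target to another outcome repeats the decomposition for that fraction, including the one that serves as the baseline.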
              Joseph Nicholas Luchman, Ph.D., PStat® (American Statistical Association)
              ----
              Research Fellow
              Fors Marsh

              ----
              Version 18.0 MP



              • #8
                Yes, I agree that it wouldn't tell a significantly different story. However, picking a baseline outcome would be difficult to justify and would not be well accepted by peers in my field. I guess I'll have to stick with a series of dominance analyses on fracreg outputs.
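
                A minimal sketch of such a series, collecting the general dominance statistics in one matrix for side-by-side reading. It assumes that -domin- leaves the general dominance statistics in e(b), in the order the predictors were listed; the matrix names GD and b are illustrative:

                ```stata
                use http://fmwww.bc.edu/repec/bocode/c/citybudget.dta, clear

                local outcomes governing safety education recreation social urbanplanning

                * one row per outcome, one column per predictor
                matrix GD = J(6, 2, .)
                matrix rownames GD = `outcomes'
                matrix colnames GD = houseval popdens

                local i = 0
                foreach v of local outcomes {
                    local ++i
                    quietly domin `v' houseval popdens, reg(fracreg logit) fitstat(e(r2_p))
                    matrix b = e(b)              // general dominance statistics (assumed)
                    matrix GD[`i', 1] = b        // fill row i starting at column 1
                }

                matrix list GD, format(%9.4f)
                ```

                Each row decomposes that outcome's own pseudo-R2, so values are directly comparable within a row; comparisons across rows rest on different fit statistics and still ignore that the fractions sum to 1.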

                Thanks for your help,
                Manuel

