  • Help with decomposition of probabilities

    Dear all,

    I am trying to replicate the methodology from the following paper (see Section 3. A Multinomial Probability Model of Income Distribution, pg 6):
    http://unpan1.un.org/intradoc/groups...npan048358.pdf

    From what I could gather, I am supposed to use -margins- after -mlogit-. But I am completely lost as to how to separate the difference into a characteristics effect and a discrimination effect by creating counterfactual distributions, in which one group is given the other group's characteristics and coefficients respectively.

    Sample code:
    Code:
    sysuse auto
    mlogit rep78 foreign price displacement gear_ratio weight
    The outcome variable has 5 categories, and the group variable is foreign.

    Any help would be much appreciated.

    Thanks.

  • #2
    Hi Alina
    Replicating the paper you linked is more complicated than what you are currently doing.
    Below is a short piece of code that implements a basic version of the method.

    Code:
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
    xtile q5=lnwage, n(5)
    drop if lnwage==.
    mlogit q5 educ exper tenure age agesq if female==1
    predict pf*
    mlogit q5 educ exper tenure age agesq if female==0
    predict pm*
    
    
    mean pf* if female==1
    est store m1
    mean pf* if female==0
    est store m2a
    mean  pm* if female==1
    est store m2b
    mean pm* if female==0
    est store m3
    
    ** This table has the results you are looking for
    ** m1: predicted probabilities of women being in each income quintile
    ** m2a: predicted probabilities of men being in each income quintile, using women's coefficients
    ** m2b: predicted probabilities of women being in each income quintile, using men's coefficients
    ** m3: predicted probabilities of men being in each income quintile
    ** differences between m1 and m2a are due to characteristics
    ** differences between m2a and m3 are due to differences in coefficients
    est tab m1 m2a m3,  nose nostar not 
    
    -----------------------------------------------------
        Variable |     m1          m2a           m3      
    -------------+---------------------------------------
             pf1 |   .2781845    .23494095               
             pf2 |  .24304539    .22699103               
             pf3 |  .16983894    .19630346               
             pf4 |  .16691068     .1812393               
             pf5 |   .1420205    .16052526               
             pm1 |                            .12916112  
             pm2 |                             .2183755  
             pm3 |                            .19174434  
             pm4 |                            .21038615  
             pm5 |                            .25033289  
    -----------------------------------------------------
    ** Same idea as above, but with the other counterfactual: m2b applies men's coefficients to women's characteristics
    est tab m1 m2b m3,  nose nostar not 
    
    -----------------------------------------------------
        Variable |     m1          m2b           m3      
    -------------+---------------------------------------
             pf1 |   .2781845                            
             pf2 |  .24304539                            
             pf3 |  .16983894                            
             pf4 |  .16691068                            
             pf5 |   .1420205                            
             pm1 |               .11402641    .12916112  
             pm2 |               .20694076     .2183755  
             pm3 |               .19885544    .19174434  
             pm4 |               .22126638    .21038615  
             pm5 |               .25891101    .25033289  
    -----------------------------------------------------
    This is the basic structure of the method you are looking for, but it needs a bit more work to obtain standard errors and to derive the actual decomposition.
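    One possible way to get standard errors is sketched below, assuming a simple bootstrap that refits both models in every replication is acceptable for your data (it ignores any complex survey design). The program name mnl_decomp and the returned scalars total, chars and coefs are made up for this illustration, and only the first quintile is decomposed:

    ```stata
    * Sketch only: bootstrap SEs for the first-quintile decomposition.
    * mnl_decomp, r(total), r(chars) and r(coefs) are hypothetical names.
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
    drop if missing(lnwage)
    xtile q5 = lnwage, nq(5)

    capture program drop mnl_decomp
    program define mnl_decomp, rclass
        tempvar pf pm
        tempname m1 m2a m3
        // women's coefficients, predicted for everyone
        quietly mlogit q5 educ exper tenure age agesq if female==1
        quietly predict double `pf', outcome(1)
        // men's coefficients, predicted for everyone
        quietly mlogit q5 educ exper tenure age agesq if female==0
        quietly predict double `pm', outcome(1)
        // mean predicted probabilities of being in quintile 1
        summarize `pf' if female==1, meanonly
        scalar `m1' = r(mean)
        summarize `pf' if female==0, meanonly
        scalar `m2a' = r(mean)
        summarize `pm' if female==0, meanonly
        scalar `m3' = r(mean)
        return scalar total = `m1' - `m3'
        return scalar chars = `m1' - `m2a'
        return scalar coefs = `m2a' - `m3'
    end

    // resample within each group and rerun the whole procedure
    bootstrap total=r(total) chars=r(chars) coefs=r(coefs), ///
        reps(200) seed(12345) strata(female): mnl_decomp
    ```

    Stratifying the resampling by female keeps the two group sizes fixed across replications. The quintile cut-offs are held fixed here; if they should be re-estimated in each replication, the xtile step would have to move inside the program.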
    HTH
    Fernando



    • #3
      Thank you for being a lifesaver once again, Mr Fernando.

      If I understand correctly, the first table has females as the reference category and the second one has males?

      Would you kindly elaborate on what you mean by the actual decomposition?



      • #4
        It depends on what you understand by reference category. I prefer not to use the "reference category" language, because it confuses me. However, based on the way Borooah (2005) uses the method, yes, you are correct: in the first case women are used as the reference category.
        By the actual decomposition I mean obtaining the differences as follows:
        Say we use the first set of numbers and look only at the first quintile. There is a 14.9 percentage point difference (27.8 - 12.9) in the share of people who belong to the first quintile when comparing the women's distribution with the men's.
        Of that, 10.58pp (23.5 - 12.9) is due to differences in coefficients, and 4.32pp (27.8 - 23.5) is due to differences in characteristics.
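
        Written out for the first quintile, the numbers from the first table add up as shown below. The symbol \bar P_1(\hat\beta^{g}, X^{h}) is my own shorthand, not the paper's notation, for the mean predicted probability of quintile 1 when group g's coefficients are applied to group h's characteristics (F = women, M = men):

        ```latex
        \begin{align*}
        \underbrace{\bar P_1(\hat\beta^{F},X^{F}) - \bar P_1(\hat\beta^{M},X^{M})}_{\text{total: } 0.27818 - 0.12916 = 0.14902}
        &= \underbrace{\bar P_1(\hat\beta^{F},X^{F}) - \bar P_1(\hat\beta^{F},X^{M})}_{\text{characteristics: } 0.27818 - 0.23494 = 0.04324} \\
        &\quad + \underbrace{\bar P_1(\hat\beta^{F},X^{M}) - \bar P_1(\hat\beta^{M},X^{M})}_{\text{coefficients: } 0.23494 - 0.12916 = 0.10578}
        \end{align*}
        ```

        The middle term, women's coefficients applied to men's characteristics, is the counterfactual stored as m2a; adding and subtracting it is what splits the total into the two effects.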



        • #5
          Thanks a lot. And if I am using survey weights, should I just replace these with -svy: mlogit- and -svy: mean-?



          • #6
            That is the part that makes this more complicated, unfortunately.
            If you are using survey data, you should use weights. But there is no easy way to obtain standard errors, which is why the paper you linked does not report any.
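
            For the point estimates only, a minimal sketch of a weighted version follows. It assumes the sampling weight is stored in a variable called wt (check your own data; I believe the example dataset has one under that name) and that plain pweights, rather than a full -svyset- design, are enough:

            ```stata
            * Sketch only: weighted point estimates, no valid standard errors.
            * Assumes the sampling weight variable is called wt.
            use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
            drop if missing(lnwage)
            xtile q5 = lnwage [pweight=wt], nq(5)

            * group-specific models, each predicted for the full sample
            mlogit q5 educ exper tenure age agesq if female==1 [pweight=wt]
            predict pf*
            mlogit q5 educ exper tenure age agesq if female==0 [pweight=wt]
            predict pm*

            * weighted mean predicted probabilities
            mean pf* if female==1 [pweight=wt]
            matrix m1 = e(b)'
            mean pf* if female==0 [pweight=wt]
            matrix m2a = e(b)'
            mean pm* if female==0 [pweight=wt]
            matrix m3 = e(b)'

            * characteristics and coefficients effects, women's coefficients as reference
            matrix cf = m1 - m2a
            matrix df = m2a - m3
            matrix list cf
            matrix list df
            ```

            The standard errors that -mean- prints here should be ignored, because they treat the predicted probabilities as data rather than as estimates.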



            • #7
              I see that now. Thank you so much, once again!



              • #8
                Just in case anyone is interested, building on the above, here's how I did it:
                Code:
                use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
                xtile q5=lnwage, n(5)
                drop if lnwage==.
                mlogit q5 educ exper tenure age agesq if female==1
                predict pf*
                mlogit q5 educ exper tenure age agesq if female==0
                predict pm*
                
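                * Mean predicted probabilities, stored as column vectors
                mean pf* if female==1    // women's characteristics, women's coefficients
                matrix m1=e(b)'
                mean pf* if female==0    // men's characteristics, women's coefficients
                matrix m2a=e(b)'
                mean pm* if female==1    // women's characteristics, men's coefficients
                matrix m2b=e(b)'
                mean pm* if female==0    // men's characteristics, men's coefficients
                matrix m3=e(b)'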
                
                //Using female as reference category
                *Total Difference
                matrix tf=m1-m3
                matrix list tf
                
                *Characteristics effect
                matrix cf=m1-m2a
                matrix list cf
                
                *Coefficients effect
                matrix df=m2a-m3
                matrix list df
                
                //Using male as reference category
                *Total Difference
                matrix tm=m3-m1
                matrix list tm
                
                *Characteristics effect
                matrix cm=m3-m2b
                matrix list cm
                
                *Coefficients effect
                matrix dm=m2b-m1
                matrix list dm
                There might be an easier way to do this that I do not know of, though.

                And thanks once again to Mr Fernando!
