
  • bootstrap s.e. with different oaxaca_rif models

    Dear all,

    I am using the user-written command oaxaca_rif (Rios-Avila 2020), which is based on the FFL methodology (references below). I run many oaxaca_rif commands with bootstrap standard errors and a seed set. My standard errors differ slightly depending on whether I run the do-file from the beginning or run the same command twice in the active do-file.

    I want all my results to be replicable regardless of whether the do-file is run from the beginning. Even if I run the same do-file from the beginning twice, the results differ. I have tried setting the seed both before the command (set seed) and inside it (the seed() option).

    Here is an example:
    bootstrap, reps(200) seed (10209): oaxaca_rif y $demo $emp , by(female) wgt(1) rif(mean)
    bootstrap, reps(200) seed (10209): oaxaca_rif y $demo $emp , by(female) wgt(1) rif(q(10))

    I want to get the same standard errors regardless of whether I run the mean or the q(10) model first. Where should I set the seed so that the results do not depend on the order?

    I have data for two years, and I run both rifreg and oaxaca_rif, with and without reweighting, at several points along the wage distribution.

    Thank you in advance for your answer.

    Rios-Avila, F. 2020. Recentered influence functions (RIFs) in Stata: RIF regression and RIF decomposition. Stata Journal 20: 51-94. https://doi.org/10.1177/1536867X20909690.
    Firpo, S. P., N. M. Fortin, and T. Lemieux. 2018. Decomposing wage distributions using recentered influence function regressions. Econometrics 6: 28. https://doi.org/10.3390/econometrics6020028.

  • #2
    Hi Aleksandra,
    I cannot replicate your problem, so a couple of questions:
    1. Do you have the latest versions of the programs from SSC?
    2. Can you replicate the following?
    Code:
    use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
     bootstrap, seed(1):oaxaca_rif lnwage educ exper tenure, by(female) wgt(1) rif(q(10))
    
    This is what I get:
    
    Bootstrap replications (50)
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5 
    ..................................................    50
    Model  : Blinder-Oaxaca RIF-decomposition
    Type   : Standard
    RIF    : q(10)
    Scale  : 1
    Group 1: female = 0 x1*b1                        N of obs 1      = 751
    Group c: x2*b1                                   N of obs C      =          .
    Group 2: female = 1 x2*b2                        N of obs 2      = 683
    
    ------------------------------------------------------------------------------
                 |   Observed   Bootstrap                         Normal-based
          lnwage | coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    overall      |
         group_1 |   3.013147   .0237284   126.98   0.000     2.966641    3.059654
         group_2 |   2.764507   .0459995    60.10   0.000      2.67435    2.854664
      difference |   .2486405   .0461566     5.39   0.000     .1581753    .3391058
       explained |   .0844322   .0195587     4.32   0.000     .0460979    .1227665
     unexplained |   .1642084   .0512007     3.21   0.001     .0638568      .26456
    -------------+----------------------------------------------------------------
    explained    |
            educ |   .0432243   .0124765     3.46   0.001     .0187708    .0676778
           exper |   .0321231   .0100047     3.21   0.001     .0125143     .051732
          tenure |   .0090848   .0079339     1.15   0.252    -.0064654     .024635
    -------------+----------------------------------------------------------------
    unexplained  |
            educ |   .0373505   .3936258     0.09   0.924     -.734142     .808843
           exper |  -.0067576   .1167978    -0.06   0.954    -.2356772     .222162
          tenure |   .0492112   .0485189     1.01   0.310    -.0458842    .1443066
           _cons |   .0844043   .5146005     0.16   0.870    -.9241941    1.093003
    ------------------------------------------------------------------------------
    Thank you



    • #3
      Dear Fernando,

      Here is my output. As far as I can see, two standard errors in the unexplained part differ in the last decimal place. I installed your command a long time ago and have been using it intensively in recent months. I don't know whether it is up to date; I can try to update it.

      Also, please find attached an Excel file showing the slightly different s.e. from my own research.

      Thank you very much for your prompt response.

      Bootstrap replications (50)
      ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
      ..................................................    50
      Model  : Blinder-Oaxaca RIF-decomposition
      Type   : Standard
      RIF    : q(10)
      Scale  : 1
      Group 1: female = 0 x1*b1                        N of obs 1      = 751
      Group c: x2*b1                                   N of obs C      =          .
      Group 2: female = 1 x2*b2                        N of obs 2      = 683

      ------------------------------------------------------------------------------
                   |   Observed   Bootstrap                         Normal-based
            lnwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
      overall      |
           group_1 |   3.013147   .0237284   126.98   0.000     2.966641    3.059654
           group_2 |   2.764507   .0459995    60.10   0.000      2.67435    2.854664
        difference |   .2486405   .0461566     5.39   0.000     .1581753    .3391058
         explained |   .0844322   .0195587     4.32   0.000     .0460979    .1227665
       unexplained |   .1642084   .0512007     3.21   0.001     .0638567      .26456
      -------------+----------------------------------------------------------------
      explained    |
              educ |   .0432243   .0124765     3.46   0.001     .0187708    .0676778
             exper |   .0321231   .0100047     3.21   0.001     .0125143     .051732
            tenure |   .0090848   .0079339     1.15   0.252    -.0064654     .024635
      -------------+----------------------------------------------------------------
      unexplained  |
              educ |   .0373505   .3936259     0.09   0.924    -.7341421     .808843
             exper |  -.0067576   .1167979    -0.06   0.954    -.2356772     .222162
            tenure |   .0492112   .0485189     1.01   0.310    -.0458842    .1443066
             _cons |   .0844043   .5146005     0.16   0.870    -.9241942    1.093003
      ------------------------------------------------------------------------------


      • #4
        Are you working on a Mac or a PC?
        I know that there can be small numerical differences between the two platforms.

        What happens if you run the same command multiple times? Do you replicate the standard errors, or do you get different results?
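
        One way to check this without comparing tables by eye is sketched below. It assumes the example dataset and command from #2, and that oaxaca_rif is installed; the matrix names V1 and V2 are only illustrative.

        ```stata
        * Sketch: run the same bootstrap twice and compare the variance matrices.
        use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear

        bootstrap, seed(1): oaxaca_rif lnwage educ exper tenure, by(female) wgt(1) rif(q(10))
        matrix V1 = e(V)                 // bootstrap variance matrix, first run

        bootstrap, seed(1): oaxaca_rif lnwage educ exper tenure, by(female) wgt(1) rif(q(10))
        matrix V2 = e(V)                 // bootstrap variance matrix, second run

        * mreldif() returns the largest relative difference between two matrices;
        * a value of 0 means the two runs produced identical variances.
        display mreldif(V1, V2)
        ```

        Running the same comparison after restarting the do-file from the top would show whether the difference comes from the data preparation rather than from the bootstrap itself.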



        • #5
          I always use a PC. When I run this twice with your data, I get the same results in all four cases. With my data, I get the same results when I run the command twice in the same active do-file, but when I run the do-file again from the beginning I get different s.e. for some estimates. Please find both Excel files attached.

          use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear

          bootstrap, seed(1):oaxaca_rif lnwage educ exper tenure, by(female) wgt(1) rif(q(10))
          outreg2 using exmple.out, lab dec(3) ctitle(10) onecol append excel auto(3)

          bootstrap, seed(1):oaxaca_rif lnwage educ exper tenure, by(female) wgt(1) rif(q(10))
          outreg2 using exmple.out, lab dec(3) ctitle(10) onecol append excel auto(3)

          ////////////////////////////////

          *** Note: I do not use set seed before this command.

          bootstrap, reps(200) seed (10209): oaxaca_rif lyhw $demo $emp ///
          if lyhw!=. & rb010==2014 & wep==1 & exclusion==0 & occ_exclusion!=1, by(female) wgt(1) rif(mean)
          outreg2 using OBRIF14_NOR_detail_final_y.out, lab dec(3) ctitle(mean 2014) onecol append excel auto(3)

          bootstrap, reps(200) seed (10209): oaxaca_rif lyhw $demo $emp ///
          if lyhw!=. & rb010==2014 & wep==1 & exclusion==0 & occ_exclusion!=1, by(female) wgt(1) rif(mean)
          outreg2 using OBRIF14_NOR_detail_final_y.out, lab dec(3) ctitle(mean 2014) onecol append excel auto(3)


          Thank you for your answer.
          By the way, how do I update it from SSC? When I type ssc install oaxaca_rif, I am told that the command is not found. Is SSC not for user-written commands?

          Best,
          Aleksandra


          • #6
            It is all part of the rif package, so just type:
            ssc install rif, replace

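            Regarding replicability: I think the only other factor that could explain these differences in standard errors is sorting. By default, sort places observations with tied key values in random order, so the row order, and with it the bootstrap resamples drawn for a given seed, can change from run to run.
            Check your do-file to see whether you do any sorting. If you do, you will have to change those lines so that the sort is stable.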
            HTH
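
            A minimal sketch of a stable sort, using the auto dataset shipped with Stata purely for illustration (the variable mean_price is a made-up name):

            ```stata
            sysuse auto, clear

            * foreign has many ties. With the stable option, tied observations keep
            * their current relative order instead of being ordered at random.
            sort foreign, stable

            * The data are now sorted by foreign, so the by prefix needs no further sort.
            by foreign: egen mean_price = mean(price)
            ```

            As far as I know, the bysort prefix sorts without the stable option, so splitting it into sort ..., stable followed by by ...: is one way to keep the row order reproducible.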



            • #7
              Thank you very much. I have many sort commands, since I use bys idhh: egen ..... commands earlier in the do-file. I also merge several datasets at the beginning of the do-file. Could you please suggest how to sort the data before running the oaxaca_rif command?

              Best,
              Aleksandra



              • #8
                OK, here is one trick you could use.
                Create an index variable when you open your data:
                use data, clear
                gen double index = _n

                * if a merge adds observations to your dataset, create another index
                merge ...
                gen double index2 = _n

                Finally, before running the OB decomposition, sort the data by the indices:
                sort index index2 index....
                That should work.
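
                Put together, the steps above might look like the following sketch. It uses the example dataset from #2 rather than your data, and the isid check is an extra safeguard that is not part of the original suggestion.

                ```stata
                use http://fmwww.bc.edu/RePEc/bocode/o/oaxaca.dta, clear
                gen double index = _n        // record the row order right after loading

                * ... data preparation that may reorder rows (sort, bysort, merge) ...

                sort index                   // restore the recorded order
                isid index                   // stops with an error if index is not unique

                bootstrap, reps(50) seed(1): oaxaca_rif lnwage educ exper tenure, by(female) wgt(1) rif(q(10))
                ```

                Because index is unique, the final sort has no ties to break, so the data reach bootstrap in the same order on every run.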

                Fernando



                • #9
                  Thank you very much for your help. I solved the problem: I ran it 8 times and the s.e. are the same every time. I simply added a sort before oaxaca_rif: sort year idhousehold idperson.

                  Thanks once again.

                  Best,
                  Aleksandra
