  • Not able to replicate the exact values in tables

    replication.dta
    Greetings!
    I have 2 questions:
    1. I intend to replicate the table below:
    [Image: Screenshot 2022-12-13 at 9.55.22 PM.png]

    Code I ran:

    gen Mentee = 1 if treat == 4
    replace Mentee = 0 if treat == 2 | treat == 3
    gen Class = 1 if treat == 3
    replace Class = 0 if treat == 2 | treat == 4
    global Controls I_emp_b lage_b sec0_b sec1_b sec2_b sec3_b sec4_b secondaryedu_b
    reg tprofits Mentee Class tprofits_b $Controls, cluster(id) robust
    test Mentee=Class
    return list
    estadd scalar pval=r(p)
    estadd local Controls "Yes"
    est sto tab2_col1
    reg tprofits Mentee Class tprofits_b $Controls if months_since_treat == 1, cluster(id) robust
    test Mentee=Class
    return list
    estadd scalar pval=r(p)
    estadd local Controls "Yes"
    est sto tab2_col2
    .
    .
    .
    esttab tab2_col* using table2.rtf, replace label b(3) se(3) stats(pval N r2 Controls, fmt(3 0 3 0) labels("p-value, H0: M = C" "Observations" "R2" "Controls")) keep (Mentee Class _cons) nomtitle onecell


    (and got the table below)
    [Image: Screenshot 2022-12-13 at 9.59.20 PM.png]

    Although the beta coefficients and standard errors are close to the published values, the intercepts and the column 1 estimates are not. Can anyone please tell me why?

    2. To replicate the figure below:
    [Image: Screenshot 2022-12-13 at 10.01.30 PM.png]

    I used the following commands:

    preserve
    collapse (mean) tprofits, by (months_since_treat Mentor Class)
    twoway (connected tprofits months_since_treat if Mentor==1) (connected tprofits months_since_treat if Class==1) (connected tprofits months_since_treat if Mentor==0 & Class==0),xlabel(-2 0 2 4 6 8 10 12 14 16 18)
    restore

    (and got the following figure)

    [Image: Screenshot 2022-12-13 at 10.03.35 PM.png]


    However, we see that the connected green and red lines are missing for periods -2 to 1. Is it because I did not include the thick grey bar that highlights when the intervention took place?

    Thanks in advance for your time!
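On the second question: a reference bar only decorates the plot and cannot make a `connected` series appear or disappear, so the gap more likely originates in the collapsed data. The following is a minimal diagnostic sketch, not a diagnosis. It assumes replication.dta is in memory, that Class has been generated as in question 1, and that Mentor is the variable used in the collapse command above (the thread does not show how it was created).

```stata
* Assumption: replication.dta is loaded and Class exists (see question 1);
* Mentor is used exactly as in the collapse command above.

* Do the group indicators have missing values in the early periods?
tabulate months_since_treat Mentor, missing
tabulate months_since_treat Class, missing

* Which group-by-month cells does collapse produce for months -2 to 1?
preserve
collapse (mean) tprofits (count) n_obs = tprofits, by(months_since_treat Mentor Class)
list if inrange(months_since_treat, -2, 1), sepby(months_since_treat)
restore
```

If a group has no row for a given month, or its indicator is missing there, the matching `if` condition in `twoway` selects nothing for that month and no line segment is drawn.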

  • #2
    Abhinaya:
    1) I cannot say about the graphs;
    2) your model (1) seems to suffer from some probably trivial mistake. Double-check your data;
    3) you seem to have a decimal digits issue, too, that is easy to fix:
    Code:
    . use "C:\Program Files\Stata17\ado\base\a\auto.dta", clear
    (1978 automobile data)
    
    . format gear_ratio %12.3f
    
    . list gear_ratio in 1/5
    
         +----------+
         | gear_r~o |
         |----------|
      1. |    3.580 |
      2. |    2.530 |
      3. |    3.080 |
      4. |    2.930 |
      5. |    2.410 |
         +----------+
    
    . format gear_ratio %12.2f
    
    . list gear_ratio in 1/5
    
         +----------+
         | gear_r~o |
         |----------|
      1. |     3.58 |
      2. |     2.53 |
      3. |     3.08 |
      4. |     2.93 |
      5. |     2.41 |
         +----------+
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Carlo, thank you so much for the reply! I have fixed the decimal issue, but could you please tell me why the column 1 results and the intercepts in my output (black table) do not match my target (white table)? The data are attached. Should I account for time fixed effects? If so, how do I do it?
      Thanks in advance!


      • #4
        1. My guess is that you aren't replicating the pooled column correctly; or that the original authors made a mistake.

        2. For the other columns, if you drop the cluster(id) robust options the standard errors match what was reported in your white table.

        3. What does "Control mean" mean in the white table? You seem to think it is the constant, but are you sure of that?

        4. Do you have a citation for the white table? If you check the methods section, you might be able to figure out what the authors did differently than what you are doing. Then again, maybe not. The authors may have left out key details, or even made mistakes themselves when they reported the results. But if you can give the citation it might be possible to identify the cause of the discrepancies.
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam


        • #5
          Richard, thank you for your reply!
          1. Yes, I am not getting the right results for the pooled regression. I will share the citation below.
          2. Oh yes! I got it. But the authors have mentioned that they have clustered standard errors at the individual level.
          3. The authors analyse the impact of formal business training classes versus local mentorship on the profits of microenterprise entrepreneurs. So I am guessing that the control mean is the intercept, that is, the mean for those who were assigned neither mentorship nor class status.
          4. Please find the citation below for your reference.
          [Image: Screenshot 2022-12-14 at 2.42.02 PM.png]
          [Image: Screenshot 2022-12-14 at 2.42.43 PM.png]
          [Image: Screenshot 2022-12-14 at 2.43.01 PM.png]
          [Image: Screenshot 2022-12-14 at 2.43.27 PM.png]

          Thanks in advance for your time!


          • #6
            Key phrase from the article, referring to table 3: "Standard errors for pooled regressions are clustered at the individual level and include wave fixed effects"

            You aren't including wave fixed effects in your pooled regression, and the individual-wave regressions should not use clustered or robust standard errors. Here is your code, tweaked accordingly.

            Code:
            * Wave identifier for the fixed effects. Factor variables cannot take
            * negative values, so shift months_since_treat (minimum -2) to start at 0.
            gen wave = months_since_treat + 2
            gen Mentee = 1 if treat == 4
            replace Mentee = 0 if treat == 2 | treat == 3
            gen Class = 1 if treat == 3
            replace Class = 0 if treat == 2 | treat == 4
            global Controls I_emp_b lage_b sec0_b sec1_b sec2_b sec3_b sec4_b secondaryedu_b

            * Column 1 (pooled): wave fixed effects, standard errors clustered on id
            reg tprofits Mentee Class tprofits_b $Controls i.wave, cluster(id) robust
            test Mentee=Class
            return list
            estadd scalar pval=r(p)
            estadd local Controls "Yes"
            est sto tab2_col1
            
            * Column 2 (single wave): default standard errors, no clustering
            reg tprofits Mentee Class tprofits_b $Controls if months_since_treat == 1
            test Mentee=Class
            return list
            estadd scalar pval=r(p)
            estadd local Controls "Yes"
            est sto tab2_col2
            
            esttab tab2_col*, se
            Final esttab output:

            Code:
            . esttab tab2_col*, se
            
            --------------------------------------------
                                  (1)             (2)   
                             tprofits        tprofits   
            --------------------------------------------
            Mentee              371.8**         266.1   
                              (133.0)         (203.4)   
            
            Class               124.5           193.7   
                              (144.9)         (200.9)   
            
            tprofits_b          0.248***        0.149** 
                             (0.0332)        (0.0565)   
            
            I_emp_b             441.5*          297.9   
                              (170.6)         (220.4)   
            
            lage_b              316.1           552.1   
                              (283.8)         (490.9)   
            
            sec0_b              327.7          -19.37   
                              (264.3)         (493.3)   
            
            sec1_b              97.64           126.8   
                              (316.9)         (640.6)   
            
            sec2_b              4.356           197.1   
                              (264.5)         (478.6)   
            
            sec3_b              141.9          -331.0   
                              (274.5)         (536.1)   
            
            sec4_b             -447.1          -172.9   
                              (314.9)        (1613.9)   
            
            secondarye~b        163.9          -44.78   
                              (95.08)         (167.2)   
            
            0.wave                  0                   
                                  (.)                   
            
            3.wave             -131.4                   
                              (125.1)                   
            
            4.wave             -589.0***                
                              (128.6)                   
            
            5.wave             -85.23                   
                              (139.0)                   
            
            6.wave             -37.31                   
                              (141.4)                   
            
            9.wave              733.3***                
                              (163.0)                   
            
            14.wave            -423.1***                
                              (124.3)                   
            
            19.wave             66.49                   
                              (149.1)                   
            
            _cons              -96.61          -443.8   
                             (1031.2)        (1762.8)   
            --------------------------------------------
            N                    2578             345
            The first 2 columns now match what is reported in the paper. I don't know how "Control mean" was calculated, but it does not appear to be the constant term.

            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            EMAIL: [email protected]
            WWW: https://www3.nd.edu/~rwilliam
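In current Stata syntax the clustering in the pooled model is usually written `vce(cluster id)`. Clustering already implies robust standard errors, so the extra `robust` is redundant. The sketch below assumes the variables and the `$Controls` global defined in #6 are in memory. It also tests the wave effects jointly and checks that absorbing them with `areg` leaves the Mentee coefficient unchanged. It is an illustration only, not the authors' code.

```stata
* Assumption: replication.dta is loaded; wave, Mentee, Class and $Controls
* are defined as in the code in #6.

* Pooled model with wave fixed effects, standard errors clustered on id
regress tprofits Mentee Class tprofits_b $Controls i.wave, vce(cluster id)
scalar b_mentee = _b[Mentee]

* Joint test that all wave effects are zero
testparm i.wave

* Same model with the wave effects absorbed rather than displayed
areg tprofits Mentee Class tprofits_b $Controls, absorb(wave) vce(cluster id)
display "Mentee: regress = " b_mentee ", areg = " _b[Mentee]
```

The two Mentee coefficients should agree up to rounding, because `absorb(wave)` fits the same wave dummies without listing them.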


            • #7
              Dear Richard,
              Thank you so much for your reply! The code worked perfectly for me and I got my intended results. Maybe I am missing something, but I found one part of the research paper which says that 'Control mean' refers to the constant term. Please find it attached below for your reference.
              [Image: Screenshot 2022-12-14 at 5.38.01 PM.png]

              Please see the bottom of the table: Column 1 (Control mean) is alpha hat.

              Thanks in advance for your time


              • #8
                Sorry, it isn't obvious to me what they are doing. Why don't you just get the current addresses of the authors and ask them?

                As luck would have it, when they wrote the article 2 of the authors had offices on the floor below mine! But I think they've gone elsewhere now.
                -------------------------------------------
                Richard Williams, Notre Dame Dept of Sociology
                StataNow Version: 19.5 MP (2 processor)

                EMAIL: [email protected]
                WWW: https://www3.nd.edu/~rwilliam


                • #9
                  Dear Richard,
                  Thank you so much for taking the time to look into my problem and for helping me with it! I truly appreciate your suggestion and shall email one of the authors.
                  I just realised that when I run the following:

                  Code:
                  reg tprofits Mentee Class
                  reg tprofits Mentee Class if months_since_treat == 1
                  reg tprofits Mentee Class if months_since_treat == 2
                  .
                  .
                  .

                  I get intercepts that are fairly close to those reported by the authors, but the beta coefficients and standard errors no longer match.
                  Last edited by Abhinaya Kanakasabai; 14 Dec 2022, 13:48.
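The closeness of these intercepts has a simple explanation: when the only regressors are dummies for mutually exclusive groups, the constant equals the raw mean of the omitted group, here the units with Mentee == 0 and Class == 0. The sketch below assumes Mentee and Class are defined as in #6. Whether this is how the authors computed "Control mean" is not confirmed anywhere in the thread.

```stata
* Assumption: replication.dta is loaded; Mentee and Class are defined as in #6.

* Regression on the two treatment dummies only, for a single wave
regress tprofits Mentee Class if months_since_treat == 1
scalar cons_hat = _b[_cons]

* Raw mean of the omitted (control) group on the same estimation sample
summarize tprofits if e(sample) & Mentee == 0 & Class == 0
display "constant = " cons_hat ", control mean = " r(mean)
```

Once the baseline controls are added, the constant becomes the predicted outcome when every control equals zero, which is why _cons in #6 does not resemble a group mean.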


                  • #10
                    I wonder if the control means are based on slightly different cases than those used in the regression analyses. For example, the authors might have had 2,700 records initially, of which 122 were dropped from the regressions because of missing data, leaving 2,578, while the control means were still based on the original 2,700 records. That would be my first guess as to why the control means seem a little off from what you can estimate using the sample.
                    -------------------------------------------
                    Richard Williams, Notre Dame Dept of Sociology
                    StataNow Version: 19.5 MP (2 processor)

                    EMAIL: [email protected]
                    WWW: https://www3.nd.edu/~rwilliam


                    • #11
                      Do you mean the line marked in red?


                      [Image: Screenshot 2022-12-14 at 9.40.43 PM.png]


                      • #12
                        If I am correct, the missing data wouldn’t come from cases that were excluded altogether. Rather, they would be cases that had data on some variables but were missing on others.

                        But this is just speculation that the authors could perhaps confirm or disconfirm.

                        How did you get the replication data set in the first place? Is there related data you might be able to retrieve? There does not seem to be any missing data (MD) in your replication data set; it may be a subset of a larger data set that does have MD and that was used to compute the control means.
                        -------------------------------------------
                        Richard Williams, Notre Dame Dept of Sociology
                        StataNow Version: 19.5 MP (2 processor)

                        EMAIL: [email protected]
                        WWW: https://www3.nd.edu/~rwilliam
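One way to look into this, sketched under the assumption that the variables and the `$Controls` global from #6 are in memory, is to check for missing values and to compare the control-group mean over all records with the mean over the records the pooled regression actually used.

```stata
* Assumption: replication.dta is loaded; wave, Mentee, Class and $Controls
* are defined as in #6.

* Report missing values in the variables entering the model
misstable summarize tprofits tprofits_b Mentee Class wave $Controls

* Pooled model, then compare the file size with the estimation sample
regress tprofits Mentee Class tprofits_b $Controls i.wave, vce(cluster id)
count
scalar n_all = r(N)
count if e(sample)
scalar n_used = r(N)
display "records in file: " n_all ", used in regression: " n_used

* Control-group mean on all control records versus the estimation sample
summarize tprofits if Mentee == 0 & Class == 0
summarize tprofits if Mentee == 0 & Class == 0 & e(sample)
```

If the two counts and the two means coincide, missing data inside this file cannot explain the gap, which would point toward a larger source file as suggested above.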


                        • #13
                          The dataset was posted by our professor. Thank you so much for sharing your knowledge! I learned a lot from you, Richard.
