  • Adding significance asterisks when using frames and reghdfe

    Hi,
    I have a forvalues loop with nested if blocks that runs multiple regressions under 55 different conditions and exports the results through a temporary frame to a large Excel table. I am using the reghdfe package, as reg has proven too slow for the clusters and fixed effects in my regression models.
    I want to automatically add an asterisk or other marker to regression results that are significant at the 0.1 level. However, the only way I have found to do this is through the outreg command, which doesn't seem to work in this setup. Here is an example of the loop with two regression models. `"`condition`c''"' expands to the local macro holding the condition for iteration c.

    Code:
    forval c= 1/54{
        capture noisily: corr var1 var2 if `condition`c''
        if inlist(c(rc), 2000,2001){
            frame frame1: replace Variable1= "Insufficient Observations"  in `c'
            frame frame1: replace Conditions= `"`condition`c''"' in `c'
        }
        if c(rc)==0{
            frame frame1: replace Variable1= word("`:colnames r(C)'", 1)  in `c'
            frame frame1: replace Variable2= word("`:colnames r(C)'", 2)  in `c'
            frame frame1: replace Conditions= `"`condition`c''"' in `c'
            corr var1 var2 if `condition`c''
            frame frame1: replace Corr= r(rho) in `c'
                    
            reghdfe var1 var2 if `condition`c'' , ab (i.yearmoed) vce(cluster semelmos)
            frame frame1: replace Reg_YearFE= e(b)[1,1] in `c'
            frame frame1: replace Std_Err_YearFE= r(table)[2,1] in `c'
            
            reghdfe var1 var2 if `condition`c'' , ab (i.sheelon) vce(cluster semelmos)
            frame frame1: replace Reg_SheelFE= e(b)[1,1] in `c'
            frame frame1: replace Std_Err_SheelFE= r(table)[2,1] in `c'
        }
    }

    frame change frame1
    cd "C:\Users\User\Documents\Research\School Project\Results\Tables and Output\Result_tables"
    export excel using results.xls, replace keepcellfmt firstrow(var)
    frame change default
    frame drop frame1

    I would appreciate any help on this issue, thank you.
    Last edited by Nitsan Machlis; 26 Jul 2022, 07:10.

  • #2
    When you are replacing values in the frame from the estimation results, you can specify conditions as in #2: https://www.statalist.org/forums/for...gression-table. In this case, the variable needs to be a string variable, so you will have something like

    Code:
    frame frame1: g Reg_SheelFE= ""
    ...  
    frame frame1: replace Reg_SheelFE= string(`=e(b)[1,1]', "%5.3f") in `c'

    But the string() expression should be within the -cond()- function as illustrated in the linked thread.
    Last edited by Andrew Musau; 26 Jul 2022, 10:05.
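
A minimal sketch of what wrapping the string() call in -cond()- could look like for a single marker at the 10% level. It uses the built-in auto data and -regress- so that it runs without SSC packages; the frame name demo and the variable name Coef are made up for illustration. It assumes that r(table) is still in memory after the estimation command, which is what the code later in this thread also relies on after reghdfe.

```stata
* Sketch only: frame "demo" and variable "Coef" are hypothetical names
sysuse auto, clear
capture frame drop demo
frame create demo
frame demo {
    set obs 1
    generate str20 Coef = ""
}

quietly regress mpg weight

* r(table)[1,1] holds the coefficient and r(table)[4,1] the p-value
* of the first regressor
local est = string(r(table)[1,1], "%5.3f")
local out = cond(r(table)[4,1] < 0.1, "`est'*", "`est'")

frame demo: replace Coef = "`out'" in 1
frame demo: list Coef
```

Because the target variable is a string, the asterisk can be appended directly to the formatted coefficient.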

    • #3
      Hi Andrew, thank you for answering. I ran the regressions with this code and received the following Excel results:

      [Attached image: Screenshot 2022-07-26 224024.png]

      The regression betas are stored as strings, but the cells contain neither the actual betas nor any asterisks.
      Why do you think this is not working?
      Thanks.

      • #4
        Note that reghdfe is from SSC (FAQ Advice #12). You need to run this in a do-file due to the line breaks.

        Code:
        sysuse auto, clear
        frame create wanted
        frame wanted{
            set obs 100
            g Status=""
            g Conditions= ""
            g Coefficient= ""
            g Std_Err=.
            
        }
        *LIST CONDITIONS
        local condition1 regexm(make, "^AMC") & !foreign
        local condition2 disp<100 & price>7000
        local condition3 rep78==2 & inrange(headroom, 1, 5)
        *C EQUALS NO. OF CONDITIONS BELOW
        forval c= 1/3{
            capture noisily: corr mpg weight if `condition`c''
            if inlist(c(rc), 2000,2001){
                frame wanted: replace Status= "Insufficient Observations"  in `c'
                frame wanted: replace Conditions= `"`condition`c''"' in `c'
            }
            if !c(rc){
                reghdfe mpg weight if `condition`c'' , ab(rep78)
                frame wanted: replace Status= "OK" in `c'
                frame wanted: replace Conditions= `"`condition`c''"' in `c'
                local b = cond(r(table)[4,1]<0.01,"`:di %5.3f `=r(table)[1,1]''***", ///
                cond(r(table)[4,1]<0.05,"`:di %5.3f `=r(table)[1,1]''**", ///
                cond(r(table)[4,1]<0.1,"`:di %5.3f `=r(table)[1,1]''*",  "`:di %5.3f `=r(table)[1,1]''")))
                frame wanted: replace Coefficient= "`b'" in `c'
                frame wanted: replace Std_Err= `=r(table)[2,1]' in `c'
            }
        }
        frame change wanted
        export excel using myfile.xls, replace keepcellfmt firstrow(var)
        Res.:

        [Attached image: Capture.PNG]
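
The nested -cond()- above repeats the formatted coefficient in every branch. One possible variation, sketched here under the assumption that only the number of asterisks differs between branches, is to build the stars separately and append them once. The example uses -regress- on the auto data rather than reghdfe so that it runs without SSC packages; the local names p and stars are illustrative.

```stata
sysuse auto, clear
quietly regress mpg weight

* p-value of the first regressor (row 4, column 1 of r(table))
local p = r(table)[4,1]

* Choose the marker once; a missing p-value compares as false and gives no stars
local stars = cond(`p' < 0.01, "***", cond(`p' < 0.05, "**", cond(`p' < 0.1, "*", "")))

* Format the coefficient and append the marker
local b = string(r(table)[1,1], "%5.3f") + "`stars'"
display "`b'"
```

With this layout, changing the display format later means editing a single string() call instead of four format strings.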

        • #5
          I ran the code with the local "b". The only remaining problem is that my regression betas are relatively small, and the string seems to keep only the first 3 decimal places:
          [Attached image]
          This means that each beta shows as 0.000 instead of the real number. Is there a way to change the local so that more decimals are shown in each cell? I am new to Stata and haven't been able to figure this out. Also, is it necessary to redefine "b" each time I run reghdfe, or is once after the first regression enough?
          Again thank you so much for your help.

          • #6
            Originally posted by Nitsan Machlis
            I ran the code with the local "b". The only remaining problem is that my regression betas are relatively small, and the string seems to keep only the first 3 decimal places:

            This means that each beta shows as 0.000 instead of the real number. Is there a way to change the local so that more decimals are shown in each cell?

            Yes. You change the format within the -cond()- function. See

            Code:
            help format
            Code:
            local b = cond(r(table)[4,1]<0.01,"`:di %13.12g `=r(table)[1,1]''***", ///
            cond(r(table)[4,1]<0.05,"`:di %13.12g `=r(table)[1,1]''**", ///
            cond(r(table)[4,1]<0.1,"`:di %13.12g `=r(table)[1,1]''*", "`:di %13.12g `=r(table)[1,1]''")))
            However, if this is the case for most of your coefficients, I would advise that you rescale your outcome or right-hand side variables. In the auto dataset, weight is measured in pounds. If I regress mileage on weight, I get

            Code:
            . sysuse auto, clear
            (1978 Automobile Data)
            
            . reg mpg weight
            
                  Source |       SS           df       MS      Number of obs   =        74
            -------------+----------------------------------   F(1, 72)        =    134.62
                   Model |   1591.9902         1   1591.9902   Prob > F        =    0.0000
                Residual |  851.469256        72  11.8259619   R-squared       =    0.6515
            -------------+----------------------------------   Adj R-squared   =    0.6467
                   Total |  2443.45946        73  33.4720474   Root MSE        =    3.4389
            
            ------------------------------------------------------------------------------
                     mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                  weight |  -.0060087   .0005179   -11.60   0.000    -.0070411   -.0049763
                   _cons |   39.44028   1.614003    24.44   0.000     36.22283    42.65774
            ------------------------------------------------------------------------------

            To rescale the coefficient, I can express weight in, e.g., hundreds of pounds.

            Code:
            sysuse auto, clear
            replace weight= weight/100
            label variable weight "Weight (100 lbs.)"
            regress mpg weight
            Res.:

            Code:
            . regress mpg weight
            
                  Source |       SS           df       MS      Number of obs   =        74
            -------------+----------------------------------   F(1, 72)        =    134.62
                   Model |  1591.99021         1  1591.99021   Prob > F        =    0.0000
                Residual |  851.469254        72  11.8259619   R-squared       =    0.6515
            -------------+----------------------------------   Adj R-squared   =    0.6467
                   Total |  2443.45946        73  33.4720474   Root MSE        =    3.4389
            
            ------------------------------------------------------------------------------
                     mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                  weight |  -.6008687   .0517878   -11.60   0.000    -.7041058   -.4976315
                   _cons |   39.44028   1.614003    24.44   0.000     36.22283    42.65774
            ------------------------------------------------------------------------------

            Also, is it necessary to redefine "b" each time I run reghdfe or is once after the first regression enough?
            It has to be each time as the coefficients are different from one regression to the next.
            Last edited by Andrew Musau; 27 Jul 2022, 04:40.
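
To illustrate why the local has to be rebuilt after every estimation, here is a rough sketch with two specifications in a loop. It is not the original poster's setup: it uses -regress- with indicator variables in place of reghdfe's absorb() so that it runs on the auto data without SSC packages, and the frame name demo and the variable names Reg_rep78FE and Reg_foreignFE are invented for the example.

```stata
sysuse auto, clear
capture frame drop demo
frame create demo
frame demo {
    set obs 1
    generate str30 Reg_rep78FE = ""
    generate str30 Reg_foreignFE = ""
}

foreach fe in rep78 foreign {
    * i.`fe' dummies stand in for absorb(`fe') in this sketch
    quietly regress mpg weight i.`fe'

    * r(table) now belongs to the regression just run, so p, stars
    * and b must all be rebuilt inside the loop
    local p = r(table)[4,1]
    local stars = cond(`p' < 0.01, "***", cond(`p' < 0.05, "**", cond(`p' < 0.1, "*", "")))
    local b = string(r(table)[1,1], "%13.12f") + "`stars'"

    frame demo: replace Reg_`fe'FE = "`b'" in 1
}

frame demo: list
```

If b were defined only once, both columns would carry the coefficient and stars from the first regression.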

            • #7
              Great, this works, thank you. However, some of the results have no asterisk and are stored as numbers instead of strings:

              [Attached image: Screenshot 2022-07-27 190332.png]

              Are these mistakes or simply not significant results? Notice that some results are stored in scientific notation while others are not.

              • #8
                Originally posted by Nitsan Machlis
                Great this works, thank you. However, some of the results have no asterisk and are stored as numbers instead of strings:
                Are these mistakes or simply not significant results?
                Those coefficients are not significant at the 10% level.

                Notice that some results are stored as scientific notation while others are not.
                That's a feature of the general display format (%#.#g), which switches to scientific notation for very small or very large values. Consider the fixed display format (%#.#f) instead; using it means changing the g's to f's in the code.

                Code:
                di %13.12g 0.0000001373608
                di %13.12f 0.0000001373608
                Res.:

                Code:
                . di %13.12g 0.0000001373608
                 1.373608e-07
                
                . di %13.12f 0.0000001373608
                0.000000137361

                are stored as numbers instead of strings
                You can have the last part of the condition

                "`:di %5.3f `=r(table)[1,1]''"
                read


                Code:
                "`=string(`=r(table)[1,1]', "%13.12f")'"
                Last edited by Andrew Musau; 27 Jul 2022, 10:37.
