  • Creating JAMA-style tables

    Greetings:

    Is there an easy way to create JAMA-style tables? I am used to esttab/estout for social science / economics-style tables, but JAMA tables format regression results differently, and making them by hand is too tedious. Instead of outcomes in the columns and treatment/covariates in the rows, each outcome gets its own row. Here is an example from a published JAMA paper:


    [Attached image: example table from a published JAMA paper]
    Last edited by Hisab Shagird; 01 Nov 2023, 20:48.

  • #2
    Another example:
    [Attached image: fig3.jpg]



    • #3
      If you have Stata 17 or newer, you can use the collect suite of commands to build your custom tables. While the second example has a graph in it, which collect is not specifically built to handle, here is my attempt to reproduce a table that looks like the first example.

      Code:
      clear all
      
      * simulate some data
      set obs 70
      set seed 18
      gen x = runiformint(0,1)
      gen z1 = rnormal()
      gen z2 = rnormal()
      gen z3 = rnormal()
      gen z4 = rnormal()
      gen z5 = rnormal()
      matrix b = 1, -1/2, 1/3, -1/4, 1/5, -1/6, 0
      matrix colname b = x z1 z2 z3 z4 z5 _cons
      matrix score lnlambda = b
      gen lambda = exp(-lnlambda)
      gen y1 = -lambda*log(1-runiform())
      label var y1 "All-cause mortality"
      gen y2 = -lambda*log(1-runiform())
      label var y2 "Cardiovascular mortality"
      gen y3 = -lambda*log(1-runiform())
      label var y3 "Rehospitalization"
      gen y4 = -lambda*log(1-runiform())
      label var y4 "Composite of mortality and rehospitalization"
      gen y5 = -exp(-lnlambda-1.5*x)*log(1-runiform())
      label var y5 "Aortic valve reintervention"
      gen d = 1
      
      * do some analysis, and collect the results
      unab yvars : y?
      foreach y of local yvars {
          stset `y', fail(d)
          streg i.x, dist(exp)
          collect get e(), tag(fit[Univariable] var[`y'])
          if "`y'" != "y5" {
              streg i.x z?, dist(exp)
              collect get e(), tag(fit[Multivariable] var[`y'])
          }
      }
      
      * apply some style edits to match the original example
      collect style cell, empty("NA")
      collect style header colname, title(hide) level(hide)
      collect style cell result[_r_b _r_lb _r_ub], nformat(%12.2f)
      collect composite define ci = _r_lb _r_ub, trim
      collect style cell result[ci], sformat("(%s)")
      collect composite define coef_ci = _r_b ci, trim
      collect label levels ///
          result _r_b "HR" ///
          ci "__LEVEL__% CI" ///
          _r_p "P value" ///
          , modify
      collect style cell result[_r_p], nformat(%5.3f) minimum(.001)
      collect style cell, halign(left)
      collect style column, dups(first)
      collect style cell border_block[corner row-header], ///
          border(right, pattern(none))
      
      * build the table
      collect layout (var) (fit#result[coef_ci _r_p]#colname[1.x])
      Here is the resulting table.
      Code:
      ------------------------------------------------------------------------------------------------
                                                   Univariable                Multivariable           
                                                   HR (95% CI)        P value HR (95% CI)      P value
      ------------------------------------------------------------------------------------------------
      All-cause mortality                          1.45 (0.91 2.32)   0.118   1.84 (1.12 3.01) 0.015  
      Cardiovascular mortality                     1.57 (0.98 2.51)   0.059   2.43 (1.49 3.96) <0.001 
      Rehospitalization                            2.30 (1.44 3.68)   <0.001  2.99 (1.83 4.91) <0.001 
      Composite of mortality and rehospitalization 3.44 (2.15 5.49)   <0.001  3.53 (2.16 5.75) <0.001 
      Aortic valve reintervention                  15.44 (9.66 24.67) <0.001  NA               NA     
      ------------------------------------------------------------------------------------------------



      • #4
        Jeff Pitblado (StataCorp) many thanks. Yes, I have Stata 18. I have been trying to use --table regression-- today but will try --collect--

        One of the issues is that there aren't many examples around, so your response is quite helpful. Also, LLMs are not helpful at all, presumably because few training examples were available for the new --table-- and --collect-- commands.

        I am going to tweak your example for a simple linear regression. Thanks again.
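
        A minimal sketch of what that adaptation might look like, assuming the shipped auto dataset and hypothetical fit labels (Unadjusted, Adjusted) that do not come from this thread. The only substantive change from the code in #3 is that regress replaces streg, so the reported coefficient is a mean difference rather than an HR:

        ```stata
        * assumption: auto.dta stands in for the real data
        sysuse auto, clear
        collect clear

        * one row per outcome, two model fits per outcome
        foreach y in price mpg weight {
            regress `y' i.foreign
            collect get e(), tag(fit[Unadjusted] var[`y'])
            regress `y' i.foreign length
            collect get e(), tag(fit[Adjusted] var[`y'])
        }

        * same style edits as in #3, relabeled for a linear model
        collect style header colname, title(hide) level(hide)
        collect style cell result[_r_b _r_lb _r_ub], nformat(%12.2f)
        collect composite define ci = _r_lb _r_ub, trim
        collect style cell result[ci], sformat("(%s)")
        collect composite define coef_ci = _r_b ci, trim
        collect label levels result _r_b "Difference" ci "__LEVEL__% CI" ///
            _r_p "P value", modify
        collect style cell result[_r_p], nformat(%5.3f) minimum(.001)

        * outcomes in rows, fits and results in columns
        collect layout (var) (fit#result[coef_ci _r_p]#colname[1.foreign])
        ```

        The layout line is unchanged apart from the coefficient name, which here is 1.foreign instead of 1.x.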



        • #5
          Many thanks Jeff for sharing the code. Is it possible to produce a table together with the forest plot (as posted by Hisab)?



          • #6
            Jeff Pitblado (StataCorp) : may I ask how one could add standard statistics from the regression results as well as any added custom statistics (e.g., added by --estadd local--)?

            I tried
            collect layout (var) (fit#result[coef_ci _r_p N]#colname[1.x])
            to add the number of observations, but nothing shows up for N.

            However, N is listed among the levels of the result dimension:

            Code:
            . collect label list result
            
              Collection: default
               Dimension: result
                   Label: Result
            Level labels:
                       N  Number of observations
                  N_fail  Number of failures
                   N_sub  Number of subjects
                    _r_b  HR
                   _r_ci  __LEVEL__% CI
                   _r_df  df
                   _r_lb  __LEVEL__% lower bound
                    _r_p  P value
                   _r_se  Std. error
                   _r_ub  __LEVEL__% upper bound
                    _r_z  z
                _r_z_abs  |z|
                    chi2  χ²
                chi2type  Type of model χ² test
                      ci  __LEVEL__% CI
                     cmd  Command
                    cmd2  Base or alternate command
                 cmdline  Command line as typed
                 coef_ci  HR (__LEVEL__% CI)
                    dead  Dead variable, _d
                  depvar  Dependent variable
                    df_m  Model DF
                    frm2  Estimation metric
                      ic  Number of iterations
                       k  Number of parameters
                    k_dv  Number of dependent variables
                    k_eq  Number of equations
              k_eq_model  Number of equations in overall model test
                      ll  Log likelihood
                    ll_0  Log likelihood, constant-only model
               ml_method  Type of ml method
                     opt  Optimization type
                       p  Model test p-value
                 predict  Program used to implement predict
             predict_sub  Predict subprogram
              properties  Command properties
                    rank  Rank of VCE
                      rc  Return code
                    risk  Total time at risk
                 stcurve  stcurve
                      t0  System time variable
               technique  Maximization technique
                   title  Title of output
                    user  Likelihood-evaluator program
                     vce  SE method
                   which  Optimization direction



            • #7
              Shafiur Rahman, regarding the forest plot question, I am not aware of a single command that does this. You would have to build the table and plot separately, then arrange them as you want in a target MS Word or LaTeX document.
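
              As a rough sketch of that two-step workflow for a Word target, assuming putdocx (putdocx collect needs Stata 17 or newer), the drugtr example data, sts graph as a stand-in for a real forest plot, and hypothetical file names:

              ```stata
              * assumption: drugtr example data; studytime/died/drug/age are its variables
              webuse drugtr, clear
              stset studytime, failure(died)

              * step 1: build the table as a collection
              collect clear
              stcox i.drug age
              collect get e()
              collect layout (colname) (result[_r_b _r_ci _r_p])

              * step 2: build the plot separately and save it as an image
              * (sts graph is only a placeholder for a forest plot)
              sts graph, by(drug)
              graph export demo_plot.png, replace

              * step 3: arrange both in one Word document
              putdocx begin
              putdocx collect
              putdocx paragraph
              putdocx image demo_plot.png
              putdocx save demo_table_plot.docx, replace
              ```

              The table and the image end up stacked in the document; placing them side by side as in the JAMA figure would still need manual arrangement in Word.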



              • #8
                Hisab Shagird, you need to add colname[1.x] to the tags on result N to get this scalar to show up using your layout. Here is the code I added to the above to make the sample size show up on the table.
                Code:
                * one call is enough: fortags() matches every N item, so no loop is needed
                collect addtags colname[1.x], fortags(result[N])
                collect label levels result N "N", modify
                collect layout (var) (fit#result[N coef_ci _r_p]#colname[1.x])
                Here is the resulting table.
                Code:
                ------------------------------------------------------------------------------------------------------
                                                             Univariable                   Multivariable              
                                                             N  HR (95% CI)        P value N  HR (95% CI)      P value
                ------------------------------------------------------------------------------------------------------
                All-cause mortality                          70 1.45 (0.91 2.32)   0.118   70 1.84 (1.12 3.01) 0.015  
                Cardiovascular mortality                     70 1.57 (0.98 2.51)   0.059   70 2.43 (1.49 3.96) <0.001 
                Rehospitalization                            70 2.30 (1.44 3.68)   <0.001  70 2.99 (1.83 4.91) <0.001 
                Composite of mortality and rehospitalization 70 3.44 (2.15 5.49)   <0.001  70 3.53 (2.16 5.75) <0.001 
                Aortic valve reintervention                  70 15.44 (9.66 24.67) <0.001  NA NA               NA     
                ------------------------------------------------------------------------------------------------------



                • #9
                  Jeff Pitblado (StataCorp) May I ask if you (or StataCorp) have any recommendations for quickly developing basic skills in using --collect--? I have been reading the Customizable Tables Reference Manual but have not gotten very far (as you can judge from my previous question).

                  I understand that --collect-- is necessarily complex in order to stay general, but it is taking me a while to construct simple solutions even when you gave me a starting point (and I have a prior background in CS, so lots of experience with all flavors of programming/scripting languages). Most users, like me I presume, will not be able to spend much time mastering the complexity of --collect-- with the Customizable Tables Reference Manual. I think many more examples than there currently are on the Stata blog would help (and eventually these examples would also get picked up by LLMs; a nice-to-have would be a Stata co-pilot).

                  In my current application, I have one outcome but several models so the table becomes too wide and unwieldy. So, I have been trying to reformat your table like so but have not made much progress.

                  Code:
                   
                  All-cause mortality
                  Univariable
                  HR 1.45
                  95% CI (0.91 2.32)
                  P value 0.118
                  N 70
                  Multivariable
                  HR 1.84
                  95% CI (1.12 3.01)
                  P value 0.015
                  N 70

                  At this time, sadly, I am thinking of resorting to just manually building tables.
                  Last edited by Hisab Shagird; 06 Nov 2023, 10:01.



                  • #10
                    I probably should not say this on this forum, but the R library gtsummary might actually make things a lot easier for your use case. Check it out if you have not already done so. I briefly used it for a medical demographic table, which looked pretty awesome with very little coding, but later realized it is easier for me to have a one-stop shop with Stata and keep all formatting and spacing consistent. It took me a while to get a customized version with dtable (Stata 18 only, though) while pilfering chunks of code from Jeff Pitblado (StataCorp)'s posts. But I finally have one for a demographic table and a second one for multiple models (Cox, 45-day landmark and competing risk model) with the same predictors (rows) and 24-month mortality, with models as column variables.

                    Feel free to email me if you want sample code (no data) to get started. I might even drop the code in the Sandbox one of these days. I don't intend to change my table structures beyond this for my work. coefplot from SSC by Ben Jann is also neat for plots like these; I configured it for my use case.
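
                    For anyone who wants a starting point, a minimal coefplot sketch, assuming the drugtr example data and hypothetical stored-estimate names (uni, multi); this is not my production code, and stcox is used here rather than the streg models from #3:

                    ```stata
                    * coefplot is user-written; install once with: ssc install coefplot
                    * assumption: drugtr example data; uni and multi are made-up names
                    webuse drugtr, clear
                    stset studytime, failure(died)

                    * fit and store the two models to compare
                    stcox i.drug
                    estimates store uni
                    stcox i.drug age
                    estimates store multi

                    * plot the treatment hazard ratio from both models on one axis
                    coefplot (uni, label("Univariable")) (multi, label("Multivariable")), ///
                        keep(1.drug) eform xline(1) xtitle("Hazard ratio (95% CI)")
                    ```

                    The eform option exponentiates the coefficients so the axis shows hazard ratios, and xline(1) marks the null value.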
                    Last edited by Girish Venkataraman; 06 Nov 2023, 11:14. Reason: changed column var details.



                    • #11
                      Hisab Shagird here is the layout I used to reproduce your example output.
                      Code:
                      collect layout (var[y1]#fit#result[N _r_b ci _r_p]) (colname[1.x])
                      Here is the table.
                      Code:
                      -------------------------------
                      All-cause mortality            
                        Univariable                  
                          N               70         
                          HR              1.45       
                          (95% CI)        (0.91 2.32)
                          P value         0.118      
                        Multivariable                
                          N               70         
                          HR              1.84       
                          (95% CI)        (1.12 3.01)
                          P value         0.015      
                      -------------------------------
                      For more specific help/advice, it helps if you provide us with data (via dataex) and your code, along with a description or example of what you are trying to accomplish.



                      • #12
                        Jeff Pitblado (StataCorp): your help is much appreciated. I will post data/code when I can (perhaps simulated data, as the project data is medical). Also, my other question was not rhetorical: I would appreciate it if you (or StataCorp) have suggestions for resources to get up to speed with --collect--. Hopefully, I can get the --collect layout-- done on my own for most tasks without having to post every small thing here on the forum.

                        Girish Venkataraman: many thanks for your tips and the kind offer. I am going to try R::gtsummary(), as I am forced to use both R and Stata because of the nature of my analysis. A GitHub repo or even a Stata forum post would certainly be useful to many.

