Calculating RR after firth logistic regression

Trishenth Fonseka

Join Date: May 2025

Posts: 3
#1

Calculating RR after firth logistic regression

15 May 2025, 20:35

Hello everyone. Is there a way to calculate relative risks after a firth logistic regression fitted to a sample of 24 patients? As ChatGPT suggested, I used a bootstrap method, but it gave some bizarre results with very wide confidence intervals. Then I posted on Reddit and got this reply: 'You can estimate adjusted means using the margins command and then divide them using nlcom to get the ratios (use testnl to get the appropriate p-value). Norton wrote a custom Stata command to estimate all this; you can get it through SSC.' So I first tried the adjrr command, but it did not recognize firthlogit. Then I tried the nlcom/testnl method, and it still did not work. Any help with this problem is greatly appreciated.

Last edited by Trishenth Fonseka; 15 May 2025, 20:54.
Tags: firth, firthlogit, regression, Suggestion
Maarten Buis

Join Date: Mar 2014

Posts: 3455
#2

16 May 2025, 00:47

Let's take a step back. The purpose of a model is to simplify the data. If we have 20,000 observations and 200 variables, we do not want our results section to look like

Person 1 answered on question 1 .... on question 2 ... on question 200 ...
Person 2 answered on question 1... on question 2...
...
Person 20,000 answered....

With 24 observations you do not have that problem.

Why go through all the complications of first a firthlogit, then a bootstrap, and then the delta method for standard errors of the risk ratios to solve a problem you do not have? Remember that when you stack method on method on method your results do not become more robust. Instead you are just adding points of failure, and your results become more and more fragile.

With 24 observations the best you can do is a bivariate regression, i.e. you do not have the observations to add control variables. So there is no added value of a model compared to a simple cross-tabulation. So that is my recommendation: forget all that modelling stuff and just report the cross-tabulation.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Nick Cox

Join Date: Mar 2014

Posts: 35685
#3

16 May 2025, 03:09

See also the cross-posting at https://www.reddit.com/r/stata/comme...ic_regression/

Please keep each site informed about cross-postings.

Maarten Buis

Join Date: Mar 2014

Posts: 3455
#4

16 May 2025, 03:16

Code:

. // create example data
. clear all

. input case exposure freq

          case   exposure       freq
  1. 1 1 9
  2. 1 0 3
  3. 0 1 3
  4. 0 0 9
  5. end

.
. expand freq
(20 observations created)

. drop freq

.
. label define case_lb 0 "control" 1 "case"

. label define exposure_lb 0 "unexposed" 1 "exposed"

. label value case case_lb

. label value exposure exposure_lb

.
. // ---------------------------------------------------------
. // now the real work starts
.
. // create the table
. table (case) (exposure),       ///
>    stat(percent, across(case)) ///
>    stat(freq)                  ///
>    nformat(%9.0f)              ///
>    sformat("%s%%" percent)     ///
>    sformat("(%s)" frequency)

--------------------------------------------
              |            exposure         
              |  unexposed   exposed   Total
--------------+-----------------------------
case          |                             
  control     |                             
    Percent   |        75%       25%     50%
    Frequency |        (9)       (3)    (12)
  case        |                             
    Percent   |        25%       75%     50%
    Frequency |        (3)       (9)    (12)
  Total       |                             
    Percent   |       100%      100%    100%
    Frequency |       (12)      (12)    (24)
--------------------------------------------

.   
. collect layout (case) (exposure#result)

Collection: Table
      Rows: case
   Columns: exposure#result
   Table 1: 4 x 6

----------------------------------------------------------------------------
          |                              exposure                           
          |       unexposed              exposed                Total       
          |  Percent   Frequency   Percent   Frequency   Percent   Frequency
----------+-----------------------------------------------------------------
case      |                                                                 
  control |      75%         (9)       25%         (3)       50%        (12)
  case    |      25%         (3)       75%         (9)       50%        (12)
  Total   |     100%        (12)      100%        (12)      100%        (24)
----------------------------------------------------------------------------

. collect style header exposure, title(hide)

. collect style header case, title(hide)

. collect style header result, level(hide)

. collect notes "Frequencies in parentheses"

. collect preview

--------------------------------------------------
        |   unexposed      exposed        Total   
--------+-----------------------------------------
control |   75%    (9)    25%    (3)    50%   (12)
case    |   25%    (3)    75%    (9)    50%   (12)
Total   |  100%   (12)   100%   (12)   100%   (24)
--------------------------------------------------
Frequencies in parentheses

.
. // compute the risk ratio
. cs case exposure

                 |        exposure        |
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |         9           3  |         12
        Noncases |         3           9  |         12
-----------------+------------------------+-----------
           Total |        12          12  |         24
                 |                        |
            Risk |       .75         .25  |         .5
                 |                        |
                 |      Point estimate    |    [95% conf. interval]
                 |------------------------+------------------------
 Risk difference |               .5       |     .153524     .846476
      Risk ratio |                3       |    1.067821    8.428375
 Attr. frac. ex. |         .6666667       |    .0635139    .8813532
 Attr. frac. pop |               .5       |
                 +-------------------------------------------------
                               chi2(1) =     6.00  Pr>chi2 = 0.0143

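For readers who want to check the risk ratio line by hand, here is a minimal sketch in Python (used only as a calculator; it is not part of the Stata session above). It assumes the usual large-sample interval computed on the log scale, which reproduces the numbers that cs reports for this table. The function name risk_ratio_ci and its argument names are made up for this illustration.

```python
import math

def risk_ratio_ci(a, n1, c, n0, z=1.959964):
    """Risk ratio with a log-scale (Wald-type) confidence interval.

    a  = cases among the exposed,   n1 = number exposed
    c  = cases among the unexposed, n0 = number unexposed
    z  = normal quantile for the desired level (default: 95%)
    """
    rr = (a / n1) / (c / n0)
    # Large-sample standard error of ln(RR)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Counts from the example table: 9 of 12 exposed and 3 of 12 unexposed are cases
rr, lower, upper = risk_ratio_ci(a=9, n1=12, c=3, n0=12)
print(f"RR = {rr:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With only 12 observations per group the interval is wide, which is to be expected with a sample this small.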

Trishenth Fonseka

Join Date: May 2025

Posts: 3
#5

16 May 2025, 11:20

Thanks for providing the correct methods. I used the firth LR and found one statistically significant exposure after adjustment for other covariates. Although I could use the above-mentioned methods along with Fisher's exact test for a univariate analysis, that would not work when testing multiple variables at once, unlike a logistic regression. This is why I was in a dilemma about whether there is a method to get a relative risk, adjusted for covariates, with reduced bias.
Clyde Schechter

Join Date: Apr 2014

Posts: 30090
#6

16 May 2025, 11:52

In general, if you must have a relative risk, not an odds ratio, and if you have a data set large enough that adjustment for covariates is appropriate, then it is best not to use a logistic model in the first place. You can use -binreg- or -poisson- to get relative risks directly.

That said, I do not understand the insistence on relative risks rather than odds ratios that some investigators make. While I agree that risk ratios are somewhat more intuitive than odds ratios, both are legitimate ways to express an association with a dichotomous outcome, and a modicum of experience handling odds ratios is all that is needed to become familiar with them. Moreover, in the case where the risk ratio is small, the odds ratio will approximately equal the risk ratio as is.
Nick Cox

Join Date: Mar 2014

Posts: 35685
#7

16 May 2025, 13:21

Pedantry corner: firthlogit is the command here but the technique is named for David Firth (b. 1957), so Firth logit is to be preferred otherwise.
Tiago Pereira

Join Date: Jan 2016

Posts: 387
#8

16 May 2025, 15:36

Originally posted by Clyde Schechter:

In general, if you must have a relative risk, not an odds ratio, and if you have a data set large enough that adjustment for covariates is appropriate, then it is best not to use a logistic model in the first place. You can use -binreg- or -poisson- to get relative risks directly.

That said, I do not understand the insistence on relative risks rather than odds ratios that some investigators make. While I agree that risk ratios are somewhat more intuitive than odds ratios, both are legitimate ways to express an association with a dichotomous outcome, and a modicum of experience handling odds ratios is all that is needed to become familiar with them. Moreover, in the case where the risk ratio is small, the odds ratio will approximately equal the risk ratio as is.

Perhaps Clyde intended to point out that, when the overall risk is low, the RR and OR tend to be approximately equal?
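A small numerical sketch of that point, in Python used purely as a calculator: the risk ratio is held at 3 while the baseline risk shrinks, and the odds ratio moves toward the risk ratio. The helper odds_ratio and the chosen baseline risks are assumptions made for illustration only; the first row corresponds to the 75% versus 25% risks in the example table from #4.

```python
def odds_ratio(p1, p0):
    """Odds ratio for risk p1 (exposed) versus risk p0 (unexposed)."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))

# Same risk ratio (3) in every row; only the baseline risk p0 changes.
for p0 in (0.25, 0.05, 0.01, 0.001):
    p1 = 3 * p0
    print(f"p0 = {p0:<6} RR = {p1 / p0:.2f}  OR = {odds_ratio(p1, p0):.2f}")
```

In the example table the risks are far from small, so the odds ratio (9) is well above the risk ratio (3); the two only come close once the risks in both groups are low.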

Last edited by Tiago Pereira; 16 May 2025, 15:39.
Clyde Schechter

Join Date: Apr 2014

Posts: 30090
#9

16 May 2025, 16:58

Yes, that is precisely what I intended. What I said is garbled and, in fact, incorrect. Thank you, Tiago, for correcting it.
Trishenth Fonseka

Join Date: May 2025

Posts: 3
#10

18 May 2025, 01:05

Thank you all for the insights.