  • firthlogit in Stata

    Hi,

    I am using firthlogit in Stata on a small sample (N = 120).

    Below is the result:


    [Screenshot: firthlogit regression output]

    As you can see, the odds ratio for the information variable is very high, with a huge standard error and wide confidence interval. I have checked for multicollinearity and found none.
    Moreover, when I calculate marginal effects with margins, dydx(*), they come out equal to the coefficients of the main model. Results below:

    [Screenshot: margins, dydx(*) output]


    The sample comes from primary data and is unbalanced. Below is the tabulation of the dependent variable (meeting, binary) and the independent variable (information).

    [Screenshot: tabulations of meeting and information]

    Now, are these estimates good to go? Moreover, given the very small sample size, can I ignore the p-values, since they come out significant in most cases?
    Last edited by Kaibalyapati Mishra; 14 May 2025, 01:06.

  • #2
    On Statalist we don't use screenshots (in your case they are unreadable anyhow). Instead, copy the text output and paste it into the message inside a code block (the # button in the toolbar when you type a message).

    My guess is that you either have very few 0s or very few 1s. In that case your data just does not contain a lot of information. That is not very satisfactory, but the honest thing to do when you don't know something is to say that you don't know. That seems to be your case: your data just does not contain enough information for you to answer your question. To quote John Tukey (1986): "The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data".

    As to your last question: definitely not. First, you got it the wrong way around: in small samples the p-values tend to be large (i.e. "not significant"). Second, research with small samples is actually the case where p-values can be somewhat useful: we humans are very good at seeing "patterns" in random noise (e.g. the Rorschach or inkblot test, https://en.wikipedia.org/wiki/Rorschach_test ). Statistical tests can help prevent us from making such mistakes. Such random noise with apparent patterns is especially common in small datasets. So, in very large datasets statistical tests contain little information, but in small datasets they are a useful first step. This comes back to my first point: if the data just isn't good enough to answer our question, then the right thing to do is to say that and start collecting better data.

    I realize that that is not the answer you are hoping for, but sometimes somebody has to give you bad news, and this time that somebody is me.

    Tukey, J. W. (1986). Sunset Salvo. The American Statistician, 40(1), 72–76.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Dear Prof.,

      Thank you so much for your response. Sorry about the screenshots; I have pasted the results into code blocks below. I also wrote incorrectly that the p-values come out significant, when in fact most are insignificant, as you can see below.
      Can you comment on what the problem seems to be now? Or is it still the quality of the data?

      Tabulation results:

      Code:
 tabulate information        // independent variable with the high odds ratio
      
          Receive |
      Information |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |        177       73.75       73.75
                1 |         63       26.25      100.00
      ------------+-----------------------------------
            Total |        240      100.00
      
      
      
      . tab meeting                               // dependent variable
      
      Attended WC |
          Meeting |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |        195       81.25       81.25
                1 |         45       18.75      100.00
      ------------+-----------------------------------
            Total |        240      100.00
      
      . tab meeting information
      
        Attended |  Receive Information
      WC Meeting |         0          1 |     Total
      -----------+----------------------+----------
               0 |       174         21 |       195 
               1 |         3         42 |        45 
      -----------+----------------------+----------
           Total |       177         63 |       240
      OR results:

      Code:
                                                              Number of obs =    120
                                                              Wald chi2(14) =  15.02
      Penalized log likelihood = -10.932615                   Prob > chi2   = 0.3770
      
      -------------------------------------------------------------------------------
            meeting | Odds ratio   Std. err.      z    P>|z|     [90% conf. interval]
      --------------+----------------------------------------------------------------
      _Iinformati_1 |      42.45   50.72137     3.14   0.002      5.94752    302.9839
           _Islum_1 |   .9995229   1.455872    -0.00   1.000     .0910527    10.97217
                age |   1.027918   .0391524     0.72   0.470     .9654939    1.094378
      _Imember_po_1 |   30.62242   50.64336     2.07   0.039     2.016722    464.9787
           _Iclub_1 |   .1385785   .1807705    -1.52   0.130     .0162126    1.184514
         _Igender_1 |   .8898505   1.152278    -0.09   0.928     .1057538    7.487525
      _Icouncilor_1 |   6.128302   7.505467     1.48   0.139     .8174456    45.94322
      _Isociety_c_1 |   8.907647    15.9983     1.22   0.223      .464275    170.9034
          _Icaste_1 |    .924658   1.182606    -0.06   0.951     .1128109    7.578986
      _Iownership_2 |   1.872342   1.900229     0.62   0.537     .3526913    9.939756
             hhsize |   .5333421   .1655405    -2.03   0.043     .3200982    .8886452
          log_per_Y |   .4465741   .5616841    -0.64   0.522      .056417    3.534901
      _Ieducation_2 |   .5035438   .6015231    -0.57   0.566     .0705811    3.592412
      _Ieducation_3 |   .2366295    .392097    -0.87   0.384     .0155019    3.612051
              _cons |   .3215287   1.830108    -0.20   0.842     .0000276    3743.097
      dydx(*) Results:

      Code:
      -------------------------------------------------------------------------------
                    |            Delta-method
                    |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
      --------------+----------------------------------------------------------------
      _Iinformati_1 |   3.748327    1.19485     3.14   0.002     1.406465    6.090189
           _Islum_1 |  -.0004772   1.456567    -0.00   1.000    -2.855296    2.854341
                age |   .0275354   .0380891     0.72   0.470    -.0471178    .1021887
      _Imember_po_1 |   3.421732     1.6538     2.07   0.039     .1803437    6.663121
           _Iclub_1 |  -1.976318   1.304463    -1.52   0.130    -4.533019    .5803826
         _Igender_1 |  -.1167018   1.294912    -0.09   0.928    -2.654682    2.421278
      _Icouncilor_1 |   1.812918   1.224722     1.48   0.139    -.5874936    4.213329
      _Isociety_c_1 |    2.18691   1.796019     1.22   0.223    -1.333222    5.707043
          _Icaste_1 |  -.0783314   1.278965    -0.06   0.951    -2.585057    2.428395
      _Iownership_2 |   .6271902   1.014894     0.62   0.537    -1.361966    2.616346
             hhsize |  -.6285923   .3103833    -2.03   0.043    -1.236932   -.0202522
          log_per_Y |  -.8061499   1.257762    -0.64   0.522    -3.271319    1.659019
      _Ieducation_2 |  -.6860845   1.194579    -0.57   0.566    -3.027417    1.655248
      _Ieducation_3 |   -1.44126   1.657008    -0.87   0.384    -4.688935    1.806416



      • #4
        I fear that I can't be more optimistic on your behalf than was Maarten Buis. Your detailed results imply that you have many missing values on individual variables, which drop out of the model fit. So you are estimating 15 parameters from 120 observations; 120 or more observations fall by the wayside. Without wanting to live or die by more precise rules, I would call that a stretch, implying a need to think about simpler models. But simpler models wouldn't be more successful; just simpler....

        For social science, though, all this seems about what one would expect. Wouldn't you be surprised if you could predict whether people attend a meeting (or whatever the outcome variable is) really just from some simple personal characteristics? My own attendance or non-attendance at meetings is often driven by purely personal and/or transient circumstances that wouldn't be in a dataset, and my guess is that such noise (in the statistical sense) is typical.



        • #5
          Thank you for your response, Nick Cox.



          • #6
            A few additional comments:

            I'm guessing you used the xi: prefix. That isn't necessary, since firthlogit supports factor variables. Factor-variable notation produces nicer-looking output and is also often necessary to get correct results from margins.
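
            For illustration, a minimal sketch of the two syntaxes. The covariate list is inferred from the _I names in your output, so treat it as an assumption:

            Code:
             * What the _I names suggest was run (xi: creates _I dummies):
             * xi: firthlogit meeting i.information i.slum age ...
             
             * Factor variables, which firthlogit supports directly:
             firthlogit meeting i.information i.slum age i.member_po i.club ///
                 i.gender i.councilor i.society_c i.caste i.ownership       ///
                 hhsize log_per_Y i.education, or
            With factor variables, margins, dydx(*) knows which regressors are 0/1 indicators and treats them as discrete changes rather than derivatives.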

            The default predict option for firthlogit is xb. Therefore, since you don't have any interactions or product terms, the marginal effects will be the same as the unexponentiated coefficients.

            Most of your variables do little or nothing. Especially with a small N, junk variables can increase the standard errors for all the variables and make it harder for any of them to be statistically significant. Rethink whether all these variables are theoretically necessary. Even if they are, you may have to sacrifice something because your sample size is too small to detect effects.

            Why are you losing so much data? Is one variable in particular poorly measured, with a lot of missing data (MD) zapping you? If so, consider dropping it. Or are there alternative measures with less MD that you could use instead?
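
            One way to see where the cases go is official Stata's misstable. A sketch only; the variable list is again inferred from your output:

            Code:
             * How many missing values does each model variable have?
             misstable summarize meeting information slum age member_po club ///
                 gender councilor society_c caste ownership hhsize           ///
                 log_per_Y education
             
             * Which patterns of missingness account for the dropped observations?
             misstable patterns meeting information hhsize log_per_Y, frequency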

            Having said all that, it may just be that you do not have adequate data for testing your ideas. As Maarten says, "if the data just isn't good enough to answer our question, then the right thing to do is to say that and start collecting better data."

            Or, as Nick says, "Wouldn't you be surprised if you could predict whether people attend a meeting (or whatever the outcome variable is) really just from some simple personal characteristics?" Even a sample of 10,000 cases might not show very strong relationships.

            But you can still try to do more with your current data. Simplify the model to use fewer variables and/or see if there are reasonable ways to reduce the number of cases lost to MD.
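
            As a purely illustrative sketch of "simplify the model", keeping only the predictors that showed any signal in your output (whether these are the theoretically right ones is your call):

            Code:
             * A much sparser specification, fitted to the same data:
             firthlogit meeting i.information i.member_po hhsize, or
            Fewer parameters per observation should shrink the standard errors, though as Nick notes, a simpler model wouldn't necessarily be more successful, just simpler.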
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology
            StataNow Version: 19.5 MP (2 processor)

            EMAIL: [email protected]
            WWW: https://www3.nd.edu/~rwilliam



            • #7
              As Maarten indicated, there might be an issue of highly skewed zeros or ones. To see how meeting is distributed in the estimation sample, just run the tabulation and cross-tab after your regression:

              Code:
               firthlogit y x1 x2 ...

              Code:
               tab meeting if e(sample)
               tab meeting information if e(sample)
              Best regards,
              Mukesh

