  • need help deciding which logit model to pick and to understand the difference between them

    I have a question about which model to use for analysing my data, which have a binary dependent variable.
    I am not quite sure whether I need a binary panel regression (xtlogit) or a regular logistic regression (logit).
    Most papers on this topic just use a simple logit model, which in my opinion is not good? (e.g. http://dx.doi.org/10.2139/ssrn.3187667 p. 17)
    In essence, I want to compare the determinants of a binary outcome (issuance of a bond) over (1) the entire period, (2) the period before COVID, and (3) the period during COVID.


    Questions I am not sure about:
    1. Should I use normal logit or xtlogit?
    2. If xtlogit: does it make sense to use a random-effects model? For most model specifications the Hausman test suggests
    3. What is the difference between the two variants (I don't really understand what the difference is):
    xtset Firm Date
    xtlogit BinaryOutcome TotalAssets NetIncome, re   // or fe

    and
    xtset Firm
    xtlogit BinaryOutcome TotalAssets NetIncome i.Date, re   // or fe
    4. Would my model of choice change if I switch to monthly or quarterly observations?

    Since I have some dummy variables that do not change over time, I am considering an RE or simple logit model over the FE.

    Here is an example data set:
    Date  Firm  Binary Outcome  Total Assets  Net Income  Dummy if certain country
    2016  A     0               50            20          1
    2017  A     0               60            30          1
    2018  A     1               70            40          1
    2019  A     1               65            60          1
    2020  A     0               85            45          1
    2021  A     0               90            54          1
    2016  B     1               20            18          0
    2017  B     0               34            13          0
    2018  B     1               44            18          0
    2019  B     0               45            5           0
    2020  B     1               60            15          0
    2021  B     0               80            7           0
    ...   ...   ...             ...           ...         ...
    Thank you for your help in advance!
    Last edited by Peter Selt; 23 Mar 2022, 09:22.

  • #2
    Peter:
    welcome to this forum.
    You have three waves of data obtained from the same sample: hence you have a panel dataset.
    If you are interested in exploring the within-panel variation as time goes by, -xtlogit,fe- is the way to go; note that, because of the incidental parameter bias, it gives back conditional fixed effects (see
    http://www.econ.brown.edu/Faculty/Tony_Lancaster/papers/IncidentalParameters1948.pdf).
    If you're interested in exploring the within-panel variation, go -xtlogit,re-.
    If there's no evidence of a panel-wise effect, go pooled logit.
    Kind regards,
    Carlo
    (Stata 19.0)
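
    The three options named in this reply can be sketched in Stata roughly as follows. This is only an illustration under assumptions: the variable names follow the example table in the opening post, and firm_id is a hypothetical numeric identifier created here because -xtset- requires a numeric panel variable.

    ```stata
    * Assumption: Firm is a string variable, as in the example table
    encode Firm, generate(firm_id)
    xtset firm_id Date

    * Conditional fixed-effects logit: uses within-firm variation only;
    * firms whose outcome never changes drop out of the estimation
    xtlogit BinaryOutcome TotalAssets NetIncome, fe
    estimates store fe

    * Random-effects logit: the output ends with an LR test of rho = 0,
    * which is one way to look for evidence of a panel-wise effect
    xtlogit BinaryOutcome TotalAssets NetIncome, re
    estimates store re

    * Hausman comparison (consistent estimator first, efficient one second)
    hausman fe re

    * Pooled logit, the fallback when no panel-wise effect is detected
    logit BinaryOutcome TotalAssets NetIncome
    ```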


    • #3
      3. What is the difference between the two variants (I don't really understand what the difference is):

      xtset Firm Date
      xtlogit BinaryOutcome TotalAssets NetIncome, re   // or fe

      and

      xtset Firm
      xtlogit BinaryOutcome TotalAssets NetIncome i.Date, re   // or fe
      In your case, there is no difference between -xtset Firm Date- and -xtset Firm-. The inclusion of Date in the -xtset- command enables you to use time series operators such as lag, lead, difference, or estimate models with autoregressive structures. But the modeling commands you are considering do not use any of those things, so inclusion of Date in -xtset- has no effect at all.

      Some users mistakenly think that specifying the time variable in -xtset- causes subsequent -xt-regression commands to automatically build a time fixed-effect into the model--but that is not true. I'm not sure where that fairly widespread mistaken belief comes from. The use of -xtset- does cause -xt-regression commands to automatically build in (or condition on) a panel fixed-effect, but not a time fixed effect.

      As for your -xtlogit- commands, they differ in that the second one includes a time fixed effect, and the first does not (regardless of which version of -xtset- you use).
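
      As a rough illustration of the distinction drawn above, the sketch below fits both specifications after a single -xtset-. It assumes the variable names from the example table and a hypothetical numeric identifier firm_id; the choice of -re- is arbitrary here and -fe- could be substituted.

      ```stata
      * Assumption: Firm is a string variable, so a numeric id is created first
      encode Firm, generate(firm_id)

      * The time variable matters only for time-series operators (L., F., D.);
      * it does not add a time effect to the models below
      xtset firm_id Date

      * Specification 1: firm effect only, no time effect
      xtlogit BinaryOutcome TotalAssets NetIncome, re

      * Specification 2: i.Date adds one indicator per year (a time fixed effect)
      xtlogit BinaryOutcome TotalAssets NetIncome i.Date, re

      * Joint test of the year indicators after specification 2
      testparm i.Date
      ```

      Re-running both commands after -xtset firm_id- alone should give the same estimates, since neither model uses a time-series operator.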


      • #4
        Originally posted by Clyde Schechter
        In your case, there is no difference between -xtset Firm Date- and -xtset Firm-. The inclusion of Date in the -xtset- command enables you to use time series operators such as lag, lead, difference, or estimate models with autoregressive structures. But the modeling commands you are considering do not use any of those things, so inclusion of Date in -xtset- has no effect at all.

        Some users mistakenly think that specifying the time variable in -xtset- causes subsequent -xt-regression commands to automatically build a time fixed-effect into the model--but that is not true. I'm not sure where that fairly widespread mistaken belief comes from. The use of -xtset- does cause -xt-regression commands to automatically build in (or condition on) a panel fixed-effect, but not a time fixed effect.

        As for your -xtlogit- commands, they differ in that the second one includes a time fixed effect, and the first does not (regardless of which version of -xtset- you use).
        Thank you very much for your answer.
        I don't know exactly why, but after a long time researching in various forums I had assumed this as well. Probably because a lot of half-knowledge gets thrown around.


        • #5
          Originally posted by Carlo Lazzaro
          Peter:
          welcome to this forum.
          You have three waves of data obtained from the same sample: hence you have a panel dataset.
          If you are interested in exploring the within-panel variation as time goes by, -xtlogit,fe- is the way to go; note that, because of the incidental parameter bias, it gives back conditional fixed effects (see
          http://www.econ.brown.edu/Faculty/Tony_Lancaster/papers/IncidentalParameters1948.pdf).
          If you're interested in exploring the within-panel variation, go -xtlogit,re-.
          If there's no evidence of a panel-wise effect, go pooled logit.
          Thank you for your answer Carlo!
          So if I want to estimate the effect of my dummy variable (which is time-invariant) together with lagged financials, I have to use an RE model, right?

          Since I strongly assume that the data behave differently in the COVID-19 panel, I would exclude the pooled logit regression.

          I am sorry if this is a trivial question.
          Last edited by Peter Selt; 25 Mar 2022, 04:11.


          • #6
            Peter:
            if you're referring to the time-invariant predictor -Dummy if certain country-, you're correct, in that its coefficient cannot be estimated via -xtlogit,fe- and you have to switch to -xtlogit,re-.
            You're also correct in stating that pooled -logit- is the way to go when -xtlogit- doesn't give you back evidence of a panel-wise effect.
            Kind regards,
            Carlo
            (Stata 19.0)


            • #7
              Thank you very much Carlo, you have helped me a lot to understand the topic.


              • #8
                I have now more or less finished my analysis, but would like to ask briefly which measures are available with xtlogit to indicate goodness of fit.
                There is no R^2 or pseudo R^2 for xtlogit, re,
                and in older forum contributions the question was never really answered.
                Since I am unfortunately not very familiar with the xtlogit command, I wanted to check here. Maybe I am on the wrong track and the output does report such a measure.

                I hope you could answer me this last question.

                best
                Peter




                • #9
                  Peter:
                  -xtlogit,re- gives back a -chi2- statistic instead of a pseudo R^2.
                  In addition, you may want to take a look at:
                  1) https://www.statalist.org/forums/for...of-xtlogit-re;
                  2) https://www.stata.com/statalist/arch.../msg00818.html
                  Kind regards,
                  Carlo
                  (Stata 19.0)
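
                  For readers who still want a summary number, one possible workaround is sketched below. It is an assumption of this note, not something the linked threads are known to recommend: information criteria via -estat ic-, plus a McFadden-style pseudo R^2 computed by hand from the log likelihoods of the full and the intercept-only random-effects models. The scalar and variable names (ll_full, ll_null, pr2, insample) are hypothetical, and the data are assumed to be -xtset- already.

                  ```stata
                  * Full random-effects model
                  xtlogit BinaryOutcome TotalAssets NetIncome, re
                  scalar ll_full = e(ll)

                  * Keep track of the estimation sample before it is overwritten
                  generate byte insample = e(sample)

                  * AIC and BIC for the full model
                  estat ic

                  * Intercept-only random-effects model on the same observations
                  xtlogit BinaryOutcome if insample, re
                  scalar ll_null = e(ll)

                  * Hand-computed McFadden-style measure; not an official xtlogit statistic
                  scalar pr2 = 1 - ll_full/ll_null
                  display "McFadden-style pseudo R2 = " pr2
                  ```

                  Such a number is best used to compare specifications fitted to the same sample rather than read as a share of explained variance.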


                  • #10
                    Okay, thank you very much.


                    I still have one question, just for my understanding.

                    It again relates to the example above, with the independent variables lagged by one year.

                    Not a single paper I have found so far uses a logistic panel regression for this kind of analysis.

                    Why can they do this?
                    Do they assume that the firm's decision does not depend on time or on changes in the independent variables?
                    Or can they do this because they only want to compare the effects/sizes of the independent variables and do not care whether there is a time-related panel structure?

                    I am sorry for troubling you with such questions, but I cannot really understand why I am the only one trying to use a panel logit while everyone else runs a simple logistic regression.



                    • #11
                      Peter:
                      it may well be that previous researchers:
                      1) conformed to a customary rule in your research field to go (pooled?) logistic;
                      2) went pooled logistic because they detected no evidence of a panel-wise effect;
                      3) the above has nothing to do with lagging (or not) an independent variable.
                      In addition, the paper you quoted:
                      a) does not mention a panel structure of the analyzed dataset;
                      b) reported on a lagged logistic regression without cluster-robust standard errors (which are the way to go when you prefer, for questionable methodological reasons, to apply cross-sectional logistic regression instead of its panel counterpart, provided that you actually have panel data).

                      To make a long story short, I would double-check whether the paper you quoted actually deals with panel data.
                      Kind regards,
                      Carlo
                      (Stata 19.0)
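
                      The pooled alternative mentioned in point b) could look roughly like the sketch below: a pooled logit on one-year lags with standard errors clustered on the firm. The names firm_id and Country are hypothetical (Country stands in for -Dummy if certain country-), and treating 2020 as the start of the COVID period is an assumption made only for illustration.

                      ```stata
                      * Assumption: Firm is a string variable, so a numeric id is created first
                      encode Firm, generate(firm_id)

                      * The time variable is required here because the L. operator is used
                      xtset firm_id Date

                      * Entire period: pooled logit, cluster-robust standard errors by firm
                      logit BinaryOutcome L.TotalAssets L.NetIncome Country, vce(cluster firm_id)

                      * Period before COVID (cutoff year is an assumption)
                      logit BinaryOutcome L.TotalAssets L.NetIncome Country if Date < 2020, vce(cluster firm_id)

                      * Period during COVID
                      logit BinaryOutcome L.TotalAssets L.NetIncome Country if Date >= 2020, vce(cluster firm_id)
                      ```

                      With lagged regressors the first year of each firm drops out of every model, which shortens the already brief sub-periods.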
