  • Logistic Regression with control variables

    Dear people,

    I have been struggling for a few days now to get my logistic regression to run.
    Whenever I enter and run the command, it goes through more than 1,500 iterations and keeps running... I guess I am doing something wrong here.

    My dependent variable is "goingconcern", which is a dummy variable for a going-concern opinion ("1" if the firm receives a going-concern opinion, "0" otherwise).

    Then I have four models with four different independent variables:
    Model 1: LEAD, which represents whether an auditor is an industry leader (Dummy variable)
    Model 2: MarketShare, which represents the market share of an auditor
    Model 3: Dummies for different levels of market share
    Model 4: MarketShare and MarketShare^2

    In addition to those variables I also have several control variables: LOG_MKT, LEV, LOSS (Dummy), ALTMAN, BTM, CFO, ROA and ROAL.
    Furthermore, I actually would like to add year-fixed effects and industry fixed effects as well.

    Does anyone know the right command to run this clogit or logit in STATA?


  • #2
    Tessa:
    as per FAQ, please report at least what you typed.
    Does the endless stream of iterations concern all your models or only some of them?
    Are you sure that, according to the literature in your research field, all the predictors are worth including?
    What happens to your models when you start the regression anew, adding one predictor at a time? Can you spot when Stata starts delaying convergence?
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      The command I used in STATA was logit goingconcern MarketShare LOG_MKT LEV ROA ROAL LOSS CFO BTM z_altman accruals

      I have already reduced the number of control variables that I took from previous literature. The endless stream of iterations concerns all my models. I ran the logistic regression for both the LEAD and the MarketShare model, and both produced an endless stream of iterations. I started the regression half an hour ago and it is still running.

      I have 39,000 observations; can that be the problem? And if so, how should I tackle it?


      • #4
        Tessa:
        are you dealing with cross-sectional or panel data?
        Kind regards,
        Carlo
        (Stata 19.0)


        • #5
          Oh I'm sorry, quite important of course. I'm dealing with panel data, with a range 2005-2014


          • #6
            When I see something like this, my first suspicion is almost perfect prediction. I would start with just one explanatory variable and add the rest one by one to see when the problem occurs. Then start looking at cross tabulations of the relevant variables and see if one of them almost perfectly predicts the outcome. If no single variable shows such a pattern, I would look at combinations of two variables, etc. Once I have identified the problem variable(s), I need to make a decision: do I change variables, drop variables, use a different model, use a different estimator, ... Which one is right depends on the exact problem and the exact purpose of the study.
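
            A minimal sketch of that stepwise check, assuming the variable names from #3 (only a subset of the controls is shown) and an arbitrary cap of 50 iterations, so that a non-converging model stops instead of running on:

            ```stata
            * Add predictors one at a time; iterate(50) is an arbitrary cap on iterations
            logit goingconcern MarketShare LOG_MKT, iterate(50)
            logit goingconcern MarketShare LOG_MKT LEV, iterate(50)

            * Cross-tabulate the outcome against a dummy regressor: an empty or
            * nearly empty cell points to (almost) perfect prediction
            tabulate LOSS goingconcern

            * For continuous regressors, compare the distributions by outcome
            bysort goingconcern: summarize LEV ROA CFO, detail
            ```
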
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------


            • #7
              Tessa:
              keeping in mind Maarten's precious insights, I would take a look at -xtlogit- and related entry in Stata 13.1 .pdf manual.
              I would also recommend that you read carefully the section covering the -fixed effect- specification, which is in fact a conditional fixed-effects estimator.
              Kind regards,
              Carlo
              (Stata 19.0)


              • #8
                I just investigated at what moment STATA starts with the endless stream of iterations and that is already when I add two control variables.
                STATA can handle logit goingconcern MarketShare LOG_MKT, but whenever I add one other control variable it goes nuts.

                The xtlogit is then the logistic regression that accounts for fixed effects? Just like xtreg?


                • #9
                  You may want to read the manual before you ask a question. For example, the title on xtlogit says xtlogit - fixed-effects, random-effects, and population-averaged logit models.

                  Once you figure out why it is not running, you may also want to consider running all your explanatory variables at once. When you run separate regressions for each of the interesting (and probably correlated) X's, your estimates suffer from omitted variable bias.

                  To try to diagnose your problem, you might try running the model as a linear regression. The regression may suggest where the problems are. You might also want to look at collinearity. What do the correlations look like among your controls?
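
                  A rough sketch of those diagnostics, assuming the variable names from the command in #3:

                  ```stata
                  * Linear probability model, used here only as a diagnostic
                  regress goingconcern MarketShare LOG_MKT LEV ROA ROAL LOSS CFO BTM z_altman accruals

                  * Variance inflation factors for the regression just fitted
                  estat vif

                  * Pairwise correlations among the controls
                  correlate LOG_MKT LEV ROA ROAL LOSS CFO BTM z_altman accruals
                  ```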


                  • #10
                    Tessa:
                    please report exactly what you typed; descriptions of what happened during your Stata (not STATA, please) session are much less helpful than seeing your code, which you can easily paste in your post (almost-rap accidental) via code delimiters (just click on the # button among the Advanced editor options). Thanks.
                    Kind regards,
                    Carlo
                    (Stata 19.0)


                    • #11
                      logit goingconcern MarketShare LOG_MKT LEV ROA ROAL LOSS CFO BTM newnew accruals

                      This is the code I used


                      • #12
                        Tessa: If that is the command you typed, then it is just pooled logit. You are not doing random effects or fixed effects logit.

                        Having said that, I don't think any of us are going to be able to help unless you show more details. It appears this last model you estimated had MarketShare in linear form, without the dummies and quadratic. This seems a sensible start. What is LOG_MKT? Is it related to MarketShare?

                        And I thought one of your key variables was LEAD. What happened to it?

                        By the way, it never hurts to start with a linear model to see what is happening. In fact, you can use fixed effects on the linear model:

                        xtreg goingconcern MarketShare LOG_MKT LEV ROA ROAL LOSS CFO BTM newnew accruals, fe cluster(firmid)

                        would be a good starting point. You can easily add year dummies with i.year. This has some advantages, including that you will see which variables drop out (due to lack of time variation). Also, the inference is robust to any serial correlation, unlike with xtlogit, fe.
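
                        A sketch of that starting point with the year dummies added, assuming the panel identifier is firmid (as in the command above) and the time variable is named year:

                        ```stata
                        * Declare the panel structure (firmid and year are assumed names)
                        xtset firmid year

                        * Linear fixed-effects model with year dummies and
                        * standard errors clustered by firm
                        xtreg goingconcern MarketShare LOG_MKT LEV ROA ROAL LOSS CFO BTM newnew accruals i.year, fe vce(cluster firmid)
                        ```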


                        • #13
                          Thank you Jeff.
                          To answer your questions first:
                          LOG_MKT is the natural logarithm of the market value of the firm
                          LEV is total leverage of the firm
                          ROA is the roa ratio
                          ROAL is the logarithm of this ratio
                          LOSS is a dummy for whether the firm has a loss or not
                          CFO is the cash flow of the firm
                          BTM is the growth opportunities
                          newnew stands for the altman z-score
                          And accruals is the absolute value of the total accruals in previous year.

                          One of my key variables was indeed LEAD, but these are two different regressions, so in fact the regressions I want to run are as follows:
                          1. logit goingconcern MarketShare LOG_MKT LEV ROA ROAL LOSS CFO BTM newnew accruals, fe cluster(firmid)
                          2. logit goingconcern LEAD LOG_MKT LEV ROA ROAL LOSS CFO BTM newnew accruals, fe cluster(firmid)
                          3. logit goingconcern MarketShare MarketShare^2 LOG_MKT LEV ROA ROAL LOSS CFO BTM newnew accruals, fe cluster(firmid)
                          4. logit goingconcern Ms20a Ms40a Ms60a Ms80a LOG_MKT LEV ROA ROAL LOSS CFO BTM newnew accruals, fe cluster(firmid)
                          In the fourth regression, the variables Ms20a, Ms40a, Ms60a and Ms80a represent the dummies for different levels of market share.
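
                          As far as I know, -logit- has no fe option and Stata does not accept MarketShare^2 in a variable list, so the four models would need to be written differently. A hedged sketch, assuming firmid and year identify the panel; the local macro name controls is only for illustration:

                          ```stata
                          * Declare the panel structure (firmid and year are assumed names)
                          xtset firmid year
                          local controls LOG_MKT LEV ROA ROAL LOSS CFO BTM newnew accruals

                          * Model 1: conditional fixed-effects logit with year dummies
                          xtlogit goingconcern MarketShare `controls' i.year, fe

                          * Model 2: industry-leader dummy
                          xtlogit goingconcern LEAD `controls' i.year, fe

                          * Model 3: linear and squared market share via factor-variable notation
                          xtlogit goingconcern c.MarketShare##c.MarketShare `controls' i.year, fe

                          * Model 4: dummies for the levels of market share
                          xtlogit goingconcern Ms20a Ms40a Ms60a Ms80a `controls' i.year, fe
                          ```

                          Firms whose goingconcern value never changes over the sample period drop out of the fe estimation, and industry dummies would normally be absorbed by the firm effects unless firms switch industry.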

                          I have done the regression you describe above, but in a feedback session with my coach she declared that a logit regression was the right regression to run, because my dependent variable is in fact a dummy variable of the going-concern opinion (i.e. "1" if the auditor does give a going-concern opinion and "0" otherwise)

                          I don't mind to run the xtreg again, because that regression actually worked, but I thought that xtlogit, clogit or normal logit were the right regressions if I deal with a dummy variable as dependent variable?


                          • #14
                            Tessa:
                            if your data are in panel format, as Jeff pointed out, you should consider a panel data regression like -xtlogit-, keeping in mind all the caveats related to the -fe- specification.
                            Kind regards,
                            Carlo
                            (Stata 19.0)


                            • #15
                              I do want to provide you with a screenshot of what happens whenever I run either a logit regression or an xtlogit. I tried different commands, using fe nolog; i.year fe nolog; fe.
                              But every time, this is what happens:
