  • Logistic regression difficulties

    I am using Stata 15.1 and am struggling to conduct a logistic regression.

    I am using financial data that has 1912 observations and in my regression I am using 34 independent variables (14 numeric and 20 dummies). Upon importing the data I used destring for the numerical independent variables and subsequently generated my binary dummies from categorical variables.

    I have no issues setting up the regression, but whenever I run -logistic-, Stata rejects the command, reporting insufficient / no observations.

    I looked at past online queries about this, and the replies always seemed to concern destring / encode, but neither works in this situation. Hoping someone has the answer?

    Many thanks in advance,
    Caleb
    Last edited by Caleb Hall-Paterson; 17 Apr 2020, 14:56. Reason: Attached is my Do-file which flows through perfectly until the regression

  • #2
    Off hand, this sounds like it is a problem with your data, rather than with your code. But you don't show anything about your data. Please post back using the -dataex- command to show an example of your Stata data set. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Also, please read the Forum FAQ for excellent advice on the most effective way to show information. In particular, attachments are discouraged. The most helpful way to show code is to copy it to your clipboard and paste it into the forum between code delimiters. FAQ #12 explains code delimiters.



    • #3
      My data looks a little like this. I have chosen three variables that broadly represent my dataset:
      - LoanTrancheSizeMM represents variables that are already numeric when the data are imported into Stata; as can be seen from my Do-file, I do not alter these
      - AverageLife is one of many variables I destring
      - IsSponsorLed is one example of a categorical variable I later make a dummy out of

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      dataex LoanTrancheSizeMM AverageLife IsSponsorLed in 1/10
      clear
      input double LoanTrancheSizeMM str17 AverageLife str3 IsSponsorLed
          30 "--"    "No" 
         170 "3.778" "Yes"
         225 "4.225" "Yes"
      440.43 "6.268" "Yes"
      242.88 "7.739" "Yes"
          10 "3.694" "No" 
        48.5 "1.219" "No" 
          10 "1.189" "No" 
         5.8 "--"    "No" 
         200 ".175"  "Yes"
      end
      Please let me know if you require a more comprehensive snapshot of my data, and I will be happy to provide it. I appreciate the advice on using the Forum; I am new to it all!
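
      For reference, a sketch of how the conversions described above might look for these three variables. The Do-file itself is not reproduced in the thread, so the exact commands are assumptions; the new variable names AverageLife_n and SponsorLedY follow those that appear in #5.

      ```stata
      * "--" marks a missing value in AverageLife; blank it out so -destring- can convert
      replace AverageLife = "" if AverageLife == "--"
      destring AverageLife, generate(AverageLife_n)

      * Dummy from the Yes/No string; left missing where IsSponsorLed is empty
      generate byte SponsorLedY = (IsSponsorLed == "Yes") if !missing(IsSponsorLed)

      * Check how many missing values each conversion produced
      count if missing(AverageLife_n)
      tabulate SponsorLedY, missing
      ```

      If AverageLife contains other non-numeric entries besides "--", -destring- will refuse to convert it and say so, which is itself a useful check. LoanTrancheSizeMM is already numeric and needs no conversion.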

      Thanks



      • #4
        I'm sorry I wasn't clearer about my request to see example data. I would like to see the example data as it appears at the point where you try to run the regression, and it should include all variables mentioned in the regression command. Also, please show the regression command itself.



        • #5
          Following your request, I attempted to show all of the data with the -dataex- command, but it returned the error 'input statement exceeds linesize limit. Try specifying fewer variables'.

          As such, I only included one of the nine industry dummy variables and had to remove a few of the other dummies / numerical variables that I consider less integral to my analysis. These variables that are not displayed in the data can still be found in the regression command at the bottom of the page.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          dataex AmtRecovered LoanTrancheSizeMM OriginalMaturity AverageLife_n Coupon_n InterestAccrued_n Revenue_n TotalAssets_n TotalLiabilities_n CreditSpread_n PrepaymentY ICommsY FlexStatusD InterestMonY GuaranteedY SponsorLedY in 1/50
          clear
          input double(AmtRecovered LoanTrancheSizeMM OriginalMaturity AverageLife_n Coupon_n InterestAccrued_n Revenue_n TotalAssets_n TotalLiabilities_n CreditSpread_n) float(PrepaymentY ICommsY FlexStatusD InterestMonY GuaranteedY SponsorLedY)
                           0     30     7     . 17.254    .      .       .       .   900 0 1 0 . 0 0
                           0    170  7.99 3.778      .  .03      .       .       .     . 0 0 0 . 0 1
                           0    225     8 4.225      .  .14      .       .       .     . 0 0 0 . 0 1
                           0 440.43  8.11 6.268      .  .03   8.11 1781.54 1544.48     . 0 0 0 . 1 1
                           0 242.88     8 7.739      .  .21      .       .       .     . 0 0 0 . 1 1
                           0     10    .5 3.694  4.734  .51      .       .       .   300 0 0 0 . 0 0
                           0   48.5  8.65 1.219  5.963  .28      .  211.42  283.43   450 0 1 0 . 0 0
                           0     10     3 1.189  4.445  .33   3.65  502.59  405.78   275 0 0 0 . 0 0
                           0    5.8     3     .  3.984 1.01   3.65  502.59  405.78   375 0 0 0 . 0 0
                           0    200 13.15  .175  3.195  .24      .       .       .   150 0 0 0 . 0 1
                           0  33.22  6.74 2.058  4.367  .96 156.94   53.61   74.05   475 0 0 0 . 0 0
                           0     75     5 1.047  5.195 1.15      .       .       .   325 0 0 0 . 0 0
                           0    125     6   .35 10.435 2.52      .       .       .   850 0 0 0 . 0 1
                           0    100   6.5 4.958  9.692  .83      .       .       .   850 0 0 0 . 0 1
                           0     25 34.52  .747  6.593 1.14      .       .       .   475 0 0 0 . 0 0
                           0     30     5 3.328  5.624 1.19      .       .       . 387.5 0 0 0 . 0 0
                           0    280     7 6.269  6.679  .45      .       .       . 537.5 0 0 0 . 0 1
                           0   13.6     3     .  3.232    .   3.65  502.59  405.78   300 0 0 0 . 0 0
                           0    6.5   .33     .      .  .04      .       .       .     . 0 1 0 . 0 0
          .01777777777777782    270   5.5 4.764  2.963  .01      .       .       .   200 0 0 0 . 0 0
          .02833333333333328    210     6     .  3.324  .83      .       .       .   275 0 0 0 . 0 0
          .04210526315789478     19  1.98     .  2.805  .05   3.65  502.59  405.78   250 0 0 0 . 0 0
          .04356391598614592 629.42   9.5 1.714  5.408 1.11   1324    4181    4780   175 0 0 0 . 0 1
          .07815220731332578  97.63     6  .806  4.608  .97      .       .       .   225 0 0 0 . 0 1
          .21819474058280022  140.7  4.25 2.414  8.001  .09      .       .       .   700 0 0 0 . 0 0
           .2918181818181818     11   7.9     .  3.468  .01      .       .       .   300 0 0 0 . 0 0
           .3946731234866828  123.9     7 2.167  5.992  .67 418.67  226.89  251.08   400 0 1 0 . 0 0
                        .404    250  4.51 2.319  3.695  .52      .       .       .   175 0 0 0 . 0 0
           .4262068965517241    145 12.81 2.972  5.658  .72  30.53  217.15  175.86   375 0 0 0 . 0 0
                       .4286      1    10     .  8.295 2.14      .       .       .   300 0 0 0 . 0 0
                         .44     25   5.5     .  3.859  .09      .       .       .   350 0 0 0 . 0 0
           .4714285714285714    350  10.5 3.642  4.908   .6      .       .       .   425 0 0 0 . 0 1
                          .5  96.61   6.5 4.967      .  .05      .       .       .     . 0 0 0 . 0 1
                          .5 120.63  8.33 6.345      .  .04      .       .       .     . 0 0 0 . 0 1
                          .5  36.86  7.33 5.345      .   .3      .       .       .     . 0 0 0 . 0 1
           .5333592686247812 102.82     6  .167  2.867  .31      .       .       .   325 0 0 0 . 0 1
                        .575      8     6     .     13  .62      .       .       .   425 0 1 0 . 0 0
           .5955555555555555      9     5     .   5.58  .48   43.7   30.83   12.04   558 0 0 0 . 0 0
                          .6  68.36     8 6.736      .   .2      .       .       .     . 0 0 0 . 1 1
                          .6     35     6 1.581  8.872  .76      .       .       .   700 0 1 0 . 0 0
                      .64406   1000 13.39 2.942  4.658  .44 297.05  1085.4 930.952   325 0 0 0 . 0 1
                        .684     25 10.25 2.958  4.408  .33 953.71 2819.96 2811.66   250 0 0 0 . 0 0
           .6994871794871795   97.5 10.17     .  3.057  .16 186.13  1011.3  373.58   243 0 0 0 . 0 0
                        .752     60     4     .  3.276    .      .       .       .   300 0 0 0 . 0 0
           .7533333333333333    150     7 3.583  3.915  .08      .       .       .   240 0 0 0 . 0 0
                         .84   1000     5 3.081  3.158  .13      .       .       .   400 0 0 0 . 0 1
           .8533333333333334    750     8 2.806  4.685  .27      .       .       .   350 0 0 0 . 0 1
           .8633333333333333     30     4     .  3.236   .2      .       .       .   300 0 0 0 . 0 0
                        .868    500     4 1.358  4.319  .17  79.56  195.68  177.45   325 0 0 0 . 0 1
           .8733333333333333    750  8.38 3.781  2.751  .02      .       .       .   175 0 0 0 . 0 1
          end
          The regression itself is:
          Code:
          logistic AccNoRecoveryCL LoanTrancheSizeMM OriginalMaturity AssignmentFee_n AverageLife_n Coupon_n GracePeriod_n InterestAccrued_n LoanCommitmentFee_n Revenue_n TotalAssets_n TotalLiabilities_n TotalLevRatio_n NetSeniorLevRatio_n CreditSpread_n PrepaymentY ICommsY IConsDY IConsSY IEnergyY IFinY IHealthY IMatsY ITechY IUtilY FlexStatusU FlexStatusD HasAmortisationTableY HardCallY SoftCallY InterestMonY GuaranteedY SponsorLedY AmendedY DiscountSecurityY
          Many thanks,
          Caleb



          • #6
            Well, I get a different error message from what you said you got:

            Code:
            . logistic AccNoRecoveryCL LoanTrancheSizeMM OriginalMaturity AssignmentFee_n AverageLife_n Coupon_n GracePeriod_n InterestAccrued_n LoanCommitmentFee_n Revenue_n TotalAsse
            > ts_n TotalLiabilities_n TotalLevRatio_n NetSeniorLevRatio_n CreditSpread_n PrepaymentY ICommsY IConsDY IConsSY IEnergyY IFinY IHealthY IMatsY ITechY IUtilY FlexStatusU Fl
            > exStatusD HasAmortisationTableY HardCallY SoftCallY InterestMonY GuaranteedY SponsorLedY AmendedY DiscountSecurityY
            variable AccNoRecoveryCL not found
            And indeed, you show no variable by that name.

            That was an unfortunate omission.

            Well, being unable to troubleshoot the actual code on a representative sample of the data, here's my best guess. Some of your variables have many observations with missing values. This may be true of the variables you did not show, as well. My best guess is that when all of the variables are taken into account, there are no observations that have all non-missing values. Remember that an observation can only be included in an estimation sample if it has non-missing values on every variable mentioned in the regression command. If even just one variable has a missing value, that observation is excluded. I suspect the distribution of missing values around the data is such that all the observations are excluded. (Actually, even in the limited set of variables you show, I noticed that InterestMonY has no non-missing values in the example data: if that's true throughout your data set, then this variable alone kills the estimation.)
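
            A minimal sketch of how this guess could be checked from a do-file. The variable list below contains only the predictors shown in the example data in #5, and the name nmiss is made up for illustration; substitute every variable that appears in the actual -logistic- command, including the outcome.

            ```stata
            * Variables to check: replace with the full list from the -logistic- command
            local vars LoanTrancheSizeMM OriginalMaturity AverageLife_n Coupon_n ///
                InterestAccrued_n Revenue_n TotalAssets_n TotalLiabilities_n ///
                CreditSpread_n PrepaymentY ICommsY FlexStatusD InterestMonY ///
                GuaranteedY SponsorLedY

            * Missing-value counts for each variable that has any missing values
            misstable summarize `vars'

            * Number of missing values per observation across those variables
            egen nmiss = rowmiss(`vars')

            * Observations that are complete on every listed variable
            count if nmiss == 0
            ```

            If the final count is 0, no observation can enter the estimation sample, which would be consistent with the "no observations" error. The -misstable- output should then point to the variables responsible, for example one that is missing in every observation.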
            Last edited by Clyde Schechter; 17 Apr 2020, 18:25.
