  • Probit model constant term

    Hey
    I have a sample of startups with the funding amount and three different investor types (A, B, C). I run a probit model to investigate whether the investor type influences the survival rate (maybe later I will work with a hazard regression, but to begin with probit is okay).
    Investor type and survival are dummy variables, equal to one if the specific investor invested in the startup and if the startup survived, respectively.

    So I use:
    generate ln_funding_amount = ln(funding_amount)
    probit survive investorA investorB ln_funding_amount

    I get significant negative coefficients for A and B and a significant positive coefficient for the funding amount. So I would say investor C is the best in terms of survival. But I also have a significant negative constant. Does this mean everything depends on the funding amount, because the index would be negative if A and B are both zero?

    On the other hand, if I do:
    probit survive investorC ln_funding_amount
    the coefficient for C is significant and positive.

    So how could I interpret the constant term? And could I say C is the best investor?

    Thanks for your help

    Kind regards,
    Alex
    Last edited by Alex Beck; 10 Jul 2019, 06:54.

  • #2
    Alex:
    in order to get more readable results, I think you should recode your regression model a bit, replacing the dummy variables for investors A and B with a single categorical variable that can benefit from -fvvarlist- notation:
    Code:
    set obs 10
    g investors_A=1 in 1/3
    g investors_B=1 in 4/6
    g investors_C=1 in 7/10
    g investors=0 if investors_A==1
    replace investors=1 if investors_B==1
    replace investors=2 if investors_C==1
    label define investors 0 "investors_A" 1 "investors_B" 2 "investors_C"
    label val investors investors
    . list
    
         +----------------------------------------------+
         | invest~A   invest~B   invest~C     investors |
         |----------------------------------------------|
      1. |        1          .          .   investors_A |
      2. |        1          .          .   investors_A |
      3. |        1          .          .   investors_A |
      4. |        .          1          .   investors_B |
      5. |        .          1          .   investors_B |
         |----------------------------------------------|
      6. |        .          1          .   investors_B |
      7. |        .          .          1   investors_C |
      8. |        .          .          1   investors_C |
      9. |        .          .          1   investors_C |
     10. |        .          .          1   investors_C |
         +----------------------------------------------+
    
    . drop investors_*
    
    . list
    
         +-------------+
         |   investors |
         |-------------|
      1. | investors_A |
      2. | investors_A |
      3. | investors_A |
      4. | investors_B |
      5. | investors_B |
         |-------------|
      6. | investors_B |
      7. | investors_C |
      8. | investors_C |
      9. | investors_C |
     10. | investors_C |
         +-------------+
    
    .
    
    probit survive investors funding_amount
    I do not consider the significant results of your last -probit- code meaningful.
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      Hey,

      Thanks for your response, Carlo.

      So in my case I did:

      sort investorA investorB investorC

      g investors_A=1 in 1/591
      g investors_B=1 in 592/1201
      g investors_C=1 in 1202/1600

      Then I followed your recommendation.
      Now I have a significant negative coefficient for investors and a significant positive one for funding_amount.

      So could I conclude that the main driver of survival is just the funding amount and that the investor types don't differ at all?
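
      A minimal sketch of an alternative that does not depend on the sort order or on hard-coded row ranges. It assumes the three dummies are named investorA, investorB and investorC as in #1, and that every startup has exactly one investor type; the assert stops with an error if that assumption fails.

      ```stata
      * Assumption: investorA, investorB, investorC are 0/1 dummies,
      * exactly one of which equals 1 in every observation.
      assert investorA + investorB + investorC == 1

      * Build the categorical variable directly from the dummies:
      * 0 = A, 1 = B, 2 = C (same coding as in #2)
      generate byte investors = 0*investorA + 1*investorB + 2*investorC
      label define investors 0 "investors_A" 1 "investors_B" 2 "investors_C"
      label values investors investors

      * Check the group sizes against the original dummies
      tabulate investors
      ```

      Building the variable from the dummy values themselves keeps the coding correct even if the data are later re-sorted or observations are added.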



      • #4
        This would be easier to discuss if you showed your actual code and output using code tags. See point 12 of the FAQ.

        I think Carlo forgot to use factor variable notation in his probit command.

        If you get a significant negative coefficient for investors, why would you conclude that investor type doesn’t matter?
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam



        • #5
          Yes sure

          Code:
          input float(survive investor_A investor_B investor_C funding_amount)
          0 0 1 0  8.043984
          0 1 0 0 9.2103405
          1 1 0 0  9.433484
          0 0 0 1  9.483036
          0 1 0 0  9.562334
          0 1 0 0  9.709721
          1 0 1 0  9.711116
          0 1 0 0  9.725317
          0 0 0 1  9.725317
          0 1 0 0  9.725317
          end
          
          probit survive i.investors funding_amount
          
          --------------------------------------------------------------------------------
                 survive |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          ---------------+----------------------------------------------------------------
               investors |
            investors_B  |  -.1547946   .1085354    -1.43   0.154      -.36752    .0579309
            investors_C  |  -.8740344    .302156    -2.89   0.004    -1.466249   -.2818195
                         |
          funding_amount |    .134063   .0233468     5.74   0.000     .0883042    .1798218
                   _cons |  -3.045053   .3601424    -8.46   0.000     -3.75092   -2.339187
          --------------------------------------------------------------------------------
          So I would interpret this as: C lowers the probability of survival and B has no significant effect (there is one if I include some more control variables).
          Could I say something about A in this case?

          Kind regards
          Alex



          • #6
            Everything is relative to Investor A. Think of its coefficient as being 0.
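
            A short sketch of how the reference category can be made explicit, assuming the categorical variable investors coded as in #2 (0 = A, 1 = B, 2 = C). The ib#. factor-variable operator only changes which group serves as the base; the fitted model is the same.

            ```stata
            * Default base level is the lowest value (A):
            * the reported coefficients are B vs A and C vs A
            probit survive i.investors funding_amount

            * Same model with C as the base level:
            * the reported coefficients are now A vs C and B vs C
            probit survive ib2.investors funding_amount
            ```

            The log likelihood and the predicted probabilities should be identical across the two fits; only the coefficients on the investor levels and the constant are re-expressed.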
            -------------------------------------------
            Richard Williams, Notre Dame Dept of Sociology


            • #7
              Okay.
              So would it be right to say that C is worse than A in terms of survival?



              • #8
                Richard caught my mistake in #2 (thanks).
                I should have written:
                Code:
                probit survive i.investors funding_amount
                Last edited by Carlo Lazzaro; 11 Jul 2019, 01:18.
                Kind regards,
                Carlo
                (Stata 18.0 SE)



                • #9
                  Alex:
                  you may want to consider -test- for -probit- postestimation:
                  Code:
                  . input float(survive investor_A investor_B investor_C funding_amount)
                  
                         survive  investo~A  investo~B  investo~C  funding~t
                    1. 0 0 1 0  8.043984
                    2. 0 1 0 0 9.2103405
                    3. 1 1 0 0  9.433484
                    4. 0 0 0 1  9.483036
                    5. 0 1 0 0  9.562334
                    6. 0 1 0 0  9.709721
                    7. 1 0 1 0  9.711116
                    8. 0 1 0 0  9.725317
                    9. 0 0 0 1  9.725317
                   10. 0 1 0 0  9.725317
                   11. end
                  
                  . g investors=0 if investor_A==1
                  (4 missing values generated)
                  
                  . replace investors=1 if investor_B==1
                  (2 real changes made)
                  
                  . replace investors=2 if investor_C==1
                  (2 real changes made)
                  
                  . label define investors 0 "investors_A" 1 "investors_B" 2 "investors_C"
                  
                  . label val investors investors
                  
                  . probit survive i.investors funding_amount
                  
                  note: 2.investors != 0 predicts failure perfectly
                        2.investors dropped and 2 obs not used
                  
                  Iteration 0:   log likelihood = -4.4986812
                  Iteration 1:   log likelihood =  -3.373829
                  Iteration 2:   log likelihood = -3.3653693
                  Iteration 3:   log likelihood = -3.3653647
                  Iteration 4:   log likelihood = -3.3653647
                  
                  Probit regression                               Number of obs     =          8
                                                                  LR chi2(2)        =       2.27
                                                                  Prob > chi2       =     0.3220
                  Log likelihood = -3.3653647                     Pseudo R2         =     0.2519
                  
                  --------------------------------------------------------------------------------
                         survive |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                  ---------------+----------------------------------------------------------------
                       investors |
                    investors_B  |   1.969227   1.725877     1.14   0.254     -1.41343    5.351885
                    investors_C  |          0  (empty)
                                 |
                  funding_amount |   1.474676   1.418464     1.04   0.299    -1.305462    4.254814
                           _cons |  -15.06073   13.59021    -1.11   0.268    -41.69705    11.57558
                  --------------------------------------------------------------------------------
                  
                  
                  . matlist e(b)
                  
                               | survive                                              
                               |        0b.         1.        2o.                    
                               | investors  investors  investors  funding~t      _cons
                  -------------+-------------------------------------------------------
                            y1 |         0   1.969227          0   1.474676  -15.06073
                  
                  . test 2.investors- _cons=0
                  
                   ( 1)  [survive]2o.investors - [survive]_cons = 0
                  
                             chi2(  1) =    1.23
                           Prob > chi2 =    0.2678
                  
                  
                  . test 1.investors- _cons=0
                  
                   ( 1)  [survive]1.investors - [survive]_cons = 0
                  
                             chi2(  1) =    1.35
                           Prob > chi2 =    0.2456
                  
                  
                  .
                  Kind regards,
                  Carlo
                  (Stata 18.0 SE)



                  • #10
                    Thanks for your answer Carlo.

                    I did as you recommended, but I am not sure how to interpret my results.

                    Code:
                    . matlist e(b)
                    
                                 | survive                                              
                                 |        0b.         1.         2.                      
                                 | investors  investors  investors  funding~t      _cons
                    -------------+-------------------------------------------------------
                              y1 |         0  -.1547946  -.8740344    .134063  -3.045053
                    
                    . test 2.investors- _cons=0
                    
                     ( 1)  [survive]2.investors - [survive]_cons = 0
                    
                               chi2(  1) =   18.18
                             Prob > chi2 =    0.0000
                    
                    . test 1.investors- _cons=0
                    
                     ( 1)  [survive]1.investors - [survive]_cons = 0
                    
                               chi2(  1) =   55.65
                             Prob > chi2 =    0.0000
                    
                    . test 0.investors- _cons=0
                    
                     ( 1)  [survive]0b.investors - [survive]_cons = 0
                    
                               chi2(  1) =   71.49
                             Prob > chi2 =    0.0000
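
                    Each test above compares an investor coefficient with the constant; because the base-level coefficient is fixed at zero, the last one is equivalent to testing _cons = 0. If the aim is instead to compare the investor groups with each other, one possibility is sketched below. It is not what was run above and assumes the same model with the investors variable from #2.

                    ```stata
                    * Refit (or simply have this as the most recent estimation)
                    probit survive i.investors funding_amount

                    * Joint test that the investor groups do not differ
                    testparm i.investors

                    * B versus C directly (each is already measured relative to A)
                    test 1.investors = 2.investors

                    * All pairwise differences on the probit index scale, with tests
                    pwcompare investors, effects
                    ```

                    The differences reported by -pwcompare- here are on the scale of the linear index, not of the survival probability.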



                    • #11
                      Here's my two cents. First, I don't think there was anything wrong with how you coded the dummy variables in the first place. It does not matter which group you choose to be the base group. I'll assume it is group C, as you started with. So you include dummies for investors_A and investors_B. As was noted, their coefficients are the differences in the intercepts relative to group C. So, if those coefficients are both negative, group C is, in fact, the most successful in terms of survivorship.

                      But what hasn't been answered yet is what to make of the negative intercept. There are two issues. First, what arguably makes sense is to compute PHI(_b[_cons]), the standard normal CDF evaluated at the constant, as this gives the probability of surviving for group C when ln_funding_amount is equal to zero. The fact that the constant is negative simply means that this survival probability is estimated to be less than 0.5.
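
                      Written out for the model in #1 with group C as the base, the point can be sketched as follows; the symbols $\beta_0$, $\beta_A$, $\beta_B$ and $\gamma$ are notation introduced here for the constant and the three slope coefficients.

                      $$\Pr(\text{survive}=1 \mid \cdot) = \Phi\big(\beta_0 + \beta_A\,\text{investorA} + \beta_B\,\text{investorB} + \gamma\,\ln(\text{funding\_amount})\big)$$

                      $$\text{Group C},\ \ln(\text{funding\_amount}) = 0:\quad \Pr(\text{survive}=1) = \Phi(\beta_0), \qquad \beta_0 < 0 \iff \Phi(\beta_0) < 0.5$$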

                      But the more important issue is to realize that PHI(_b[_cons]) is for ln_funding_amount = 0. Is this an interesting value? It means the funding amount would be $1 (or whatever the currency is). Does it appear in your sample? Just like in a linear model, the intercept in a probit by itself might not be meaningful because it sets all the covariates to zero. If you run a standard wage equation, the intercept is often not meaningful either, because it makes little sense to set education, age, experience, and so on to zero. I suspect the same is true in your case.

                      If you want to see what the estimated survival probabilities are on average for the three groups, do this. And also obtain the average partial effects.

                      Code:
                      probit survive i.investorA i.investorB ln_funding_amount
                      margins i.investorA i.investorB
                      margins, dydx(investorA investorB)
                      At least the above will average out the probability across the funding variable.
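
                      The same idea can be sketched with a single categorical variable coded as in #2. The names investors and ln_funding_amount are assumptions, and the value 9 in at() is only an illustrative funding level, not one taken from the thread.

                      ```stata
                      probit survive i.investors ln_funding_amount

                      * Probability implied by the constant alone:
                      * base group (A under this coding) at ln_funding_amount = 0
                      display normal(_b[_cons])

                      * Average predicted survival probability for each investor group
                      margins investors

                      * Average partial effects of B and C relative to the base group
                      margins, dydx(investors)

                      * Predicted probabilities at a chosen funding level rather than zero
                      margins investors, at(ln_funding_amount=9)
                      ```

                      Evaluating at a funding level that actually occurs in the sample usually gives a more interpretable baseline than the constant on its own.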

                      JW



                      • #12
                        Okay
                        Thank you all for your help. I think I have a better understanding now!

                        Kind regards,
                        Alex
