
  • How to run out-of-sample test with clogit?

    Hi all,

    I have run a conditional logistic regression on my estimation sample (data from 2001-2014) and used it to select the most relevant and least multicollinear variables. Now I want to apply the same model to my prediction sample (2015-2018). That is, I want the coefficients estimated in the estimation sample to be the ones used when I run the model on the prediction sample.

    How do I save these coefficients and run the model again on a different data set (the prediction sample)?

    The estimation sample and the prediction sample are two separate Excel data sets; they are not combined.

    Thanks in advance

    BR
    Max Immonen

  • #2
    -help estimates store-
    -help estimates restore-
    -help estimates save-
    -help estimates use-

    For predicted probabilities pc1 or pu0, or for xb, -predict- will provide out-of-sample predictions.
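
    A minimal sketch of the save-and-reuse route, assuming the two samples sit in separate workbooks and that both contain variables with identical names. The file names (estimation.xlsx, prediction.xlsx, clogit_est) and the variable names (depvar, x1, x2, groupvar, p_hat) are placeholders for illustration, not names from your data:

    ```stata
    * Fit the model on the estimation sample (placeholder names throughout)
    import excel using "estimation.xlsx", firstrow clear
    clogit depvar x1 x2, group(groupvar)

    * Write the estimation results to disk (creates clogit_est.ster)
    estimates save clogit_est, replace

    * Load the prediction sample, then bring the saved results back
    import excel using "prediction.xlsx", firstrow clear
    estimates use clogit_est

    * Out-of-sample predictions using the stored coefficients
    predict p_hat, pc1
    ```

    If everything happens in one Stata session, -estimates store- and -estimates restore- do the same job in memory without writing a file.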

    Also, if it makes sense in other respects, you could -append- the two data sets together before doing the original conditional logistic regression. In that case, you wouldn't even need to store or save the estimates. Just use -if inrange(year, 2001, 2014)- for the estimation sample regression, and then apply -predict- with no -if- clause.
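
    Roughly, the -append- route could look like the sketch below. It assumes both data sets have the same variable names and a numeric variable called year; the file names and the variable names depvar, x1, x2, groupvar and p_hat are placeholders:

    ```stata
    * Convert the prediction sample to a Stata data set first (placeholder file names)
    import excel using "prediction.xlsx", firstrow clear
    save "prediction.dta", replace

    * Load the estimation sample and stack the prediction sample underneath it
    import excel using "estimation.xlsx", firstrow clear
    append using "prediction.dta"

    * Estimate on 2001-2014 only
    clogit depvar x1 x2 if inrange(year, 2001, 2014), group(groupvar)

    * No -if- here, so predictions are produced for 2015-2018 as well
    predict p_hat, pc1
    ```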



    • #3
      Hi Clyde,

      Just to be sure I did this the right way, I have copied the commands from Stata below.

      1) I imported the estimation sample into Stata
      2) I ran clogit on the estimation sample
      3) I used the "estimates store" command
      4) I imported the prediction sample into Stata
      5) I used the "estimates restore" command
      6) I used the "predict pc1" command

      So now I should have takeover probabilities for the prediction sample, calculated with the coefficients from the estimation sample?

      Thank you very much!



      import excel "/Users/omistaja/Desktop/STATA.READY.DATA.xlsx", sheet("Estimation sample") firstrow clear

      . clogit Target1NonTarget0 LNMARKETCAP LNEV LNNETSALES LNTOTALASSETS PE PB PS SalesGrowth ROE ProfitMargin EBITDAMargin Operating
      > CashFlowMargin AssetTurnover Gearing CurrentRatio Liquidity InvestmentBehavior Leverage IndustrydisturbanceDUMMY GRDUMMY, group
      > (YEARSIC3)
      note: multiple positive outcomes within groups encountered.
      note: 2,497 groups (8,780 obs) dropped because of all positive or
      all negative outcomes.

      Iteration 0: log likelihood = -873.87402
      Iteration 1: log likelihood = -856.33418
      Iteration 2: log likelihood = -856.26146
      Iteration 3: log likelihood = -856.26145

      Conditional (fixed-effects) logistic regression

      Number of obs = 5,015
      LR chi2(20) = 582.64
      Prob > chi2 = 0.0000
      Log likelihood = -856.26145 Pseudo R2 = 0.2539

      ------------------------------------------------------------------------------------------
             Target1NonTarget0 |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
      -------------------------+----------------------------------------------------------------
                   LNMARKETCAP |  -1.447227   .1717374    -8.43   0.000    -1.783826   -1.110628
                          LNEV |   2.433571   .1948839    12.49   0.000     2.051606    2.815536
                    LNNETSALES |  -1.446167    .228276    -6.34   0.000     -1.89358   -.9987539
                 LNTOTALASSETS |   .6060187   .2335801     2.59   0.009     .1482101    1.063827
                            PE |   .0017869   .0011036     1.62   0.105    -.0003761    .0039499
                            PB |  -.0015023    .022008    -0.07   0.946    -.0446372    .0416327
                            PS |  -.0242994   .0337597    -0.72   0.472    -.0904671    .0418684
                   SalesGrowth |  -.0264502   .0404047    -0.65   0.513    -.1056419    .0527416
                           ROE |   .0087524    .199407     0.04   0.965    -.3820782    .3995831
                  ProfitMargin |   1.229701   .6304049     1.95   0.051    -.0058699    2.465272
                  EBITDAMargin |  -1.621412   .4868985    -3.33   0.001    -2.575716   -.6671084
       OperatingCashFlowMargin |   .3656425   .3313136     1.10   0.270    -.2837202    1.015005
                 AssetTurnover |   .9327297   .1788173     5.22   0.000     .5822542    1.283205
                       Gearing |  -.1243681   .0518317    -2.40   0.016    -.2259563   -.0227799
                  CurrentRatio |  -.4329885   .0678735    -6.38   0.000    -.5660181    -.299959
                     Liquidity |   6.982205   .5560583    12.56   0.000      5.89235    8.072059
            InvestmentBehavior |   .4625813   .5579098     0.83   0.407    -.6309019    1.556065
                      Leverage |  -1.910667   .5446221    -3.51   0.000    -2.978107   -.8432278
      IndustrydisturbanceDUMMY |   .5266789   .1982566     2.66   0.008     .1381032    .9152547
                       GRDUMMY |   .1721919   .1326558     1.30   0.194    -.0878088    .4321925
      ------------------------------------------------------------------------------------------

      . estimates store
      name required
      r(100);

      . estimates store esamplecoeffi

      .
      . import excel "/Users/omistaja/Desktop/STATA.READY.DATA.xlsx", sheet("Prediction sample") firstrow clear

      . estimates restore esamplecoeffi
      (results esamplecoeffi are active now)

      . estimates dir esamplecoeffi

      -------------------------------------------------------
      name | command depvar npar title
      -------------+-----------------------------------------
      esamplecoe~i | clogit Target1Non~0 20
      -------------------------------------------------------

      . predict pc1
      (option pc1 assumed; probability of success given one success within group)
      (1 missing value generated)

      . brow

      . rename pc1 takeoverprobability



      • #4
        Yes, except for your glitch with -estimates store- the first time, which you immediately corrected, this looks right to me.
