
  • Predict residual error after oprobit regression

    Hello,
    We are trying to generate the residuals after running an oprobit regression. Our dataset consists of 202 towns, and our dependent variable is the number of tire dealers, which can be 0, 1, 2, 3, 4 or 5.
    The regression we are running is the following:
    oprobit N_tire ln_Sm eld pinc lnhdd ffrac landv, robust

    Here, N_tire is the number of tire dealers, ln_Sm is the logarithm of the market size, eld is the fraction of old people in the population, pinc is the per capita income, lnhdd is the logarithm of heating degree days, ffrac is the fraction of land in farms, and landv is the value per acre of land.

    The output of the regression is:
    Ordered probit regression
    N_tire      Coef.  St.Err.  t-value  p-value  [95% Conf  Interval]  Sig
    ln_Sm       1.271    0.122    10.44    0.000      1.033      1.510  ***
    eld        -2.957    1.823    -1.62    0.105     -6.530      0.617
    pinc        0.049    0.076     0.64    0.524     -0.101      0.198
    lnhdd       0.041    0.202     0.20    0.840     -0.355      0.437
    ffrac       0.089    0.261     0.34    0.733     -0.422      0.600
    landv      -0.129    0.470    -0.28    0.783     -1.051      0.792
    cut1        0.259    1.854       .b       .b     -3.374      3.893
    cut2        1.142    1.873       .b       .b     -2.530      4.814
    cut3        1.957    1.880       .b       .b     -1.728      5.641
    cut4        2.535    1.879       .b       .b     -1.147      6.217
    cut5        2.938    1.875       .b       .b     -0.737      6.613

    Mean dependent var      2.233    SD dependent var        1.815
    Pseudo r-squared        0.255    Number of obs         202.000
    Chi-square            143.725    Prob > chi2             0.000
    Akaike crit. (AIC)    541.263    Bayesian crit. (BIC)  577.654
    *** p<0.01, ** p<0.05, * p<0.1
    To generate the residuals of this regression we use the following command right after this regression:
    predict uhat, resid

    However, when we do this, we get the error message "option resid not allowed", r(198).

    We need to get the residuals of this regression in order to be able to get the standard deviation of the error term. We don't know how to fix this error. Please let us know if you know what we are doing wrong. Thank you very much.
    Last edited by sladmin; 29 Jan 2020, 08:01. Reason: Anonymize original poster

  • #2
    Why are you trying to fit an ordered-probit regression model to count data?



    • #3
      The assignment given out by my university asks us to do the following:
      - Use the ordered probit method to estimate (Stata command: oprobit) the model and briefly interpret. Indicate which output reflects the number of firm fixed effects α(n)*.
      From this output we will get beta*, alpha(n+1)* and gamma*, which are defined as beta/sigma, alpha(n+1)/sigma and gamma/sigma.
      - Calculate the entry thresholds and entry threshold ratios for a “representative market”. A representative market is a market with mean values for all independent variables. Provide an economic interpretation for your results.
      The entry thresholds are calculated using the real beta, alpha and gamma. Thus we need to find sigma, the standard deviation of the error term. The error term is i.i.d. N(0, sigma^2).

      That is why we are trying to generate the residuals.

      Right now I have solved it in the following way (below is what we have put in our do-file):
      *Ordered Probit Regression*
      oprobit N_tire ln_Sm eld pinc lnhdd ffrac landv, robust
      *Linear prediction of number of tire dealers*
      predict Ntire, xb
      *Generating residual as actual number of tire dealers minus the linear prediction of number of tire dealers*
      gen residual = N_tire - Ntire
      *Generating sigma, the standard deviation of the residual*
      egen sigma = sd(residual)
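
      For reference, a sketch of the textbook ordered probit in generic notation may help place sigma. The symbols below are assumptions for illustration: y* is the latent variable, x'beta the linear index and kappa_j the cutpoints, which need not match the assignment's alpha and gamma exactly. The outcome probabilities depend on the parameters only through their ratios to sigma, which is why oprobit fixes sigma = 1 and reports the starred (scaled) quantities.

      ```latex
      \begin{aligned}
      y_i^{*} &= x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim \text{i.i.d. } N(0, \sigma^2) \\
      \Pr(y_i \le j \mid x_i) &= \Pr(\varepsilon_i \le \kappa_j - x_i'\beta)
          = \Phi\!\left(\frac{\kappa_j - x_i'\beta}{\sigma}\right)
          = \Phi\!\left(\kappa_j^{*} - x_i'\beta^{*}\right) \\
      \beta^{*} &= \beta/\sigma, \qquad \kappa_j^{*} = \kappa_j/\sigma
      \end{aligned}
      ```

      Under this setup, multiplying beta, kappa_j and sigma by the same positive constant leaves every outcome probability unchanged, so the data cannot pin down sigma separately, and the standard deviation of N_tire minus the linear prediction is not an estimate of it. Ratios of two scaled parameters, such as kappa_j*/beta*, are unaffected by the scaling because sigma cancels.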



      • #4
        I line up with Joseph Coveney here.

        The fact that residuals are not accessible directly after oprobit is a subtle hint that they are dubious. The point of ordered probit is that the response is ordered but not necessarily counted or measured and the fitting just treats the categories as ordered. Residuals as observed MINUS predicted may seem easy to calculate, but that is not a metric that the fitting method respects.
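
        What predict does offer after oprobit is the linear index and the predicted outcome probabilities. A minimal sketch, using the variable names from post #1; the new variable names (xbhat, p0 to p5, pr2, ptotal) are made up for illustration, and the six-name list assumes all six values 0 to 5 occur in the estimation sample, as the five cutpoints in the output suggest.

        ```stata
        * Refit the model from post #1
        oprobit N_tire ln_Sm eld pinc lnhdd ffrac landv, vce(robust)

        * Linear index x'b on the latent scale (not a predicted count)
        predict double xbhat, xb

        * One predicted probability per outcome 0, 1, ..., 5
        predict double p0 p1 p2 p3 p4 p5, pr

        * Probability of one particular outcome, here N_tire == 2
        predict double pr2, pr outcome(2)

        * Check: the six probabilities should sum to 1 in every town
        egen double ptotal = rowtotal(p0 p1 p2 p3 p4 p5)
        summarize ptotal
        ```

        The xb prediction lives on the latent scale defined by the cutpoints, so subtracting it from the observed category does not give a residual in the usual regression sense.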

        I can't see a reason why the number of tire dealers is limited to 5 in a town; the observed upper limit is just some side-effect of town sizes. That being so, Poisson regression is to me a more obvious method. This may be a ragbag dataset, however, and your teachers more concerned to show technique....
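
        A minimal sketch of the Poisson alternative, assuming the same covariates as in post #1. This is not what the assignment asks for, and the new variable names (mu_hat, res_pois) are illustrative only.

        ```stata
        * Poisson regression of the dealer count on the same covariates
        poisson N_tire ln_Sm eld pinc lnhdd ffrac landv, vce(robust)

        * Predicted number of dealers (the default statistic after poisson)
        predict double mu_hat, n

        * Raw residual on the count scale: observed minus predicted count
        generate double res_pois = N_tire - mu_hat
        summarize res_pois
        ```

        Here observed minus predicted is at least on the same scale as the response, which is not the case for the latent index after oprobit.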

        For our policy on help with assignments please do study #4 of https://www.statalist.org/forums/help#adviceextras



        • #5
          Okay. No more answers needed. Please do not reply to this post anymore.
