  • Transforming Ordinal Variable to Continuous Variable

    I have an ordinal variable having the categories 0, 1, 2, 3, 4. The categories are based on an unknown distribution of an unknown continuous variable. "0" represents that the average value of the unknown variable lies below the 80th percentile. "1" represents that the average value lies between the 80th and 90th percentile; "2" represents the average value lies between the 90th and 94th percentile; "3" represents the average value between 94th and 96th percentile while the last category "4" represents average value above the 96th percentile.

    This variable is at daily frequency for each firm. Thus, the firm identifier makes the panel variable and the date makes the time variable. Since these categories only look at the right tail of the distribution, most of the values in the data are "0".

    I want to transform this variable into a continuous variable using the conditional means of the truncated normal distribution. How can this be done in Stata? I am using Stata 12.1. The data look as follows: the firm identifier is "permno", "date" is the time variable, and "inst_atten" is the ordinal variable that I want to transform.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double permno int date float inst_atten
    10026 18263 0
    10026 18266 0
    10026 18267 1
    10026 18268 0
    10026 18269 0
    10026 18270 0
    10026 18273 0
    10026 18274 2
    10026 18275 0
    10026 18276 0
    10026 18277 3
    10026 18280 4
    10026 18281 0
    10026 18282 0
    10026 18283 0
    10026 18284 0
    10026 18287 0
    10026 18288 0
    10026 18289 1
    10026 18290 1
    10026 18291 0
    10026 18294 0
    10026 18295 0
    10026 18296 0
    10026 18297 0
    10026 18298 0
    10026 18301 0
    10026 18302 0
    10026 18303 0
    end
    format %tdnn/dd/CCYY date


  • #2
    I suggest looking at the truncreg command and its postestimation predict options. My guess is that either the e(....) or the ystar(...) prediction is what you are hoping to obtain.

    The only thing I would add is to be sure you want the conditional mean of a truncated normal and not a censored normal. Since most of your outcomes are zeros it is unclear to me whether the truncated normal is most appropriate.


    • #3
      Thank you, John, for your suggestion. I want to add that my ordinal variable is NOT to be used as a DEPENDENT variable. It will be used as a regressor. Furthermore, I want the transformed continuous variable only as a robustness check.


      • #4
        If I'm interpreting your question correctly (I'm really not sure I am), would something like this give you what you want?
        Code:
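        * Bounds of each category on the standard normal scale.
        * A missing bound marks an open-ended interval for -intreg-.
        tempvar lb ub

        gen `lb' = .
        gen `ub' = .

        * 0: below the 80th percentile (lower bound stays missing)
        replace `ub' = invnormal(.8) if inst_atten == 0

        * 1: 80th to 90th percentile
        replace `lb' = invnormal(.8) if inst_atten == 1
        replace `ub' = invnormal(.9) if inst_atten == 1

        * 2: 90th to 94th percentile
        replace `lb' = invnormal(.9) if inst_atten == 2
        replace `ub' = invnormal(.94) if inst_atten == 2

        * 3: 94th to 96th percentile
        replace `lb' = invnormal(.94) if inst_atten == 3
        replace `ub' = invnormal(.96) if inst_atten == 3

        * 4: above the 96th percentile (upper bound stays missing)
        replace `lb' = invnormal(.96) if inst_atten == 4

        * Constant-only interval regression on the two bounds
        intreg `lb' `ub'

        * Conditional mean E(y | lb < y < ub) for each observation
        predict normalvar, e(`lb',`ub')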
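
        As a cross-check, the conditional means can also be written out in closed form. The sketch below assumes the latent variable is exactly standard normal (mean 0, standard deviation 1), which is an assumption on my part and not something -intreg- imposes: -intreg- estimates the mean and standard deviation from the data, so the two sets of values should agree only if the estimated constant is close to 0 and sigma close to 1. The variable name z_atten is hypothetical.

        Code:
        * Closed-form conditional means, ASSUMING a standard normal latent variable.
        * z_atten is a hypothetical name; a and b denote the lower and upper bounds.
        gen double z_atten = .

        * 0: E(z | z < b) = -normalden(b)/normal(b), where normal(b) = .8
        replace z_atten = -normalden(invnormal(.8))/.8 if inst_atten == 0

        * 1 to 3: E(z | a < z < b) = (normalden(a) - normalden(b))/(normal(b) - normal(a))
        replace z_atten = (normalden(invnormal(.8)) - normalden(invnormal(.9)))/(.9 - .8) if inst_atten == 1
        replace z_atten = (normalden(invnormal(.9)) - normalden(invnormal(.94)))/(.94 - .9) if inst_atten == 2
        replace z_atten = (normalden(invnormal(.94)) - normalden(invnormal(.96)))/(.96 - .94) if inst_atten == 3

        * 4: E(z | z > a) = normalden(a)/(1 - normal(a)), where normal(a) = .96
        replace z_atten = normalden(invnormal(.96))/(1 - .96) if inst_atten == 4

        * Compare with the -intreg- prediction, category by category
        tabstat z_atten normalvar, by(inst_atten) statistics(mean)

        If the two columns differ noticeably, the observed category shares probably do not match the nominal 80/10/4/2/4 split, and it is worth deciding which of the two scalings suits the robustness check.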


        • #5
          John, your interpretation of what I wanted was spot on. So you first define the upper bounds and lower bounds for each category of the ordinal variable "inst_atten" and then use interval regression with only a constant (and no regressor) to predict the fitted values.

          Thank you again.
