
  • Transforming the variable 15th to 80th percentile

    I want to transform a variable so that it only includes values from the 15th to the 80th percentile of the data. What is the code for that?

    Any help shall be highly appreciated!

  • #2
    What do you mean by "transforming a variable to include a part of the data set"? Do you mean that you want to keep or drop part of the data according to the percentile value of a particular variable?



    • #3
      If I understand you correctly, you want to drop the cases with the lowest 15% and the highest 20% of the values. If you want to do this "automatically", i.e. let the program choose the cut points instead of looking up the thresholds yourself, I would recommend the .ado program -dichoct- (on SSC). It finds the optimal thresholds for cutting off the lowest 15% and the highest 20% (an "optimal threshold" produces the least error by determining whether to include or exclude values at the threshold). An example using Stata's famous auto data set:
      Code:
      cap which dichoct           // check whether -dichoct- is already installed
      if _rc ssc install dichoct  // install -dichoct- if necessary
      
      sysuse auto, clear
      tab1 length                 // you would look at the cum. perc. to do it by hand
      
      dichoct length, centile(15) gen(length_15)   // optimal split at lowest 15%
      dichoct length, centile(80) gen(length_80)   // optimal split at highest 20%
      tab2 length_15 length_80, col row            // show optimal splits
      
      gen length_15_80 = length if length_15==1 & length_80==0  // length betw. 15% and 80%
      sum length length_15_80     // compare descriptive statistics of old and new variable
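
      For comparison, here is a minimal sketch that does not use -dichoct-, assuming plain percentile cut points are good enough and that values exactly at the 15th and 80th percentile should be kept. It relies only on the built-in -_pctile- command and the inrange() function; the names p15, p80 and length_mid are illustrative, not part of the example above.

```stata
sysuse auto, clear

// compute the 15th and 80th percentile of length; results land in r(r1), r(r2)
_pctile length, percentiles(15 80)
local p15 = r(r1)
local p80 = r(r2)

// copy length only where it lies within [p15, p80]; all other cases become missing
gen length_mid = length if inrange(length, `p15', `p80')

sum length length_mid       // compare the original and the restricted variable
```

      Because ties at the cut points are kept on both sides, the share of retained cases can differ slightly from exactly 65%.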
      Last edited by Dirk Enzmann; 18 Apr 2021, 10:28.
