  • Sorting and dropping

    Hi everyone,
    I have a dataset consisting of 10 numeric variables. How can I sort the variables and then drop the first and last 50 observations? I would like to sort the variables independently, so that sorting one variable does not affect the order of the others.
    Thanks

  • #2
    Code:
    help sort
    help drop


    • #3
      What you hope to do does not seem to make sense within the Stata context. In general the data for each observation are expected to be related to each other - variables measured for a single firm, for example, or for one city or individual.

      If you sort one variable while leaving the others unchanged, each observation will end up with a value of that variable that originally belonged to a different observation.

      I think to get helpful advice you need to tell us more about your dataset, with sample data, as you have done on previous topics, and about the objective you have, rather than the steps you think you need to take to reach that objective.


      • #4
        Thanks for your reply. My question may not seem to make sense within the Stata context, but it makes perfect sense for the purpose of my study. I have 2000 estimates yielded by my Bayesian biprobit model in R, and I am now trying to compute the 95% credible intervals. The columns in my dataset (the Bayesian estimates) are therefore not related to each other. To get the 95% credible intervals, I first need to sort each column independently and then drop the first and last 50 observations. I hope this clarifies my question.


        • #5
          Perhaps this example code will start you in a useful direction. Note that "drop" is not what is being done here - there are the same number of observations before and afterwards, which is not what happens when the drop command is used to eliminate observations. The code replaces the largest and smallest values with missing values so they are excluded from subsequent calculations.
          Code:
          // generate example data
          set obs 20
          set seed 666
          generate b1 = runiform(0,1)
          generate b2 = runiform(-50,50)
          
          // keep the middle 18 values for each variable
          local N_low  = 2
          local N_high = 19
          
          // keep track of the original order of the data
          generate seq = _n
          
          foreach v of varlist b1 b2 {
              sort `v'
              generate trim_`v' = `v'
              replace  trim_`v' = . if ! inrange(_n,`N_low',`N_high')
              }
          
          // put observations back in their original order
          sort seq
          
          summarize b1 trim_b1 b2 trim_b2
          list, clean noobs
          Code:
          . summarize b1 trim_b1 b2 trim_b2
          
              Variable |        Obs        Mean    Std. Dev.       Min        Max
          -------------+---------------------------------------------------------
                    b1 |         20    .4466837    .2603703   .0451471   .8836644
               trim_b1 |         18    .4447146     .234622    .066487   .8180724
                    b2 |         20   -.1864041    32.62759  -47.30211   48.69324
               trim_b2 |         18   -.2844008    30.30953  -44.44604    48.4127
          
          . list, clean noobs
          
                    b1          b2   seq    trim_b1     trim_b2  
              .1498635    3.418392     1   .1498635    3.418392  
              .4540368    36.74982     2   .4540368    36.74982  
              .2755941     48.4127     3   .2755941     48.4127  
               .066487   -5.858458     4    .066487   -5.858458  
              .3977311   -31.82873     5   .3977311   -31.82873  
              .5504283    29.30499     6   .5504283    29.30499  
              .2672185    7.094431     7   .2672185    7.094431  
              .3493752   -41.07987     8   .3493752   -41.07987  
              .8180724   -11.28583     9   .8180724   -11.28583  
              .8836644   -10.41394    10          .   -10.41394  
              .4745677    48.69324    11   .4745677           .  
              .2509398    -31.9016    12   .2509398    -31.9016  
              .7385626   -27.23103    13   .7385626   -27.23103  
              .0835627     24.4834    14   .0835627     24.4834  
              .0451471     9.97022    15          .     9.97022  
              .8038014   -44.44604    16   .8038014   -44.44604  
              .5108842   -31.55127    17   .5108842   -31.55127  
              .6510049    41.85792    18   .6510049    41.85792  
               .668859    29.18567    19    .668859    29.18567  
              .4938731   -47.30211    20   .4938731           .
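
          The same idea might scale to the case described in #4 (2000 draws, 50 discarded in each tail) roughly as follows. This is only a sketch under stated assumptions: the variables b1 and b2 and their simulated values are placeholders for the real draws, and the seed is arbitrary. The cutoffs are computed from _N rather than typed in, and after sorting, the interval ends are read off as the first and last retained values. The _pctile call is included as a cross-check; its percentile definition can differ slightly from simple order statistics, so the two sets of endpoints should be close but need not match exactly.

          ```stata
          // simulated stand-in for 2000 posterior draws (hypothetical data)
          clear
          set obs 2000
          set seed 12345
          generate b1 = rnormal(0.5, 0.1)
          generate b2 = rnormal(-2, 1)

          // discard 50 draws in each tail: 50/2000 = 2.5% per tail
          local n_tail = 50
          local N_low  = `n_tail' + 1
          local N_high = _N - `n_tail'

          // keep track of the original order of the data
          generate seq = _n

          foreach v of varlist b1 b2 {
              sort `v'
              // after sorting, the retained draws are observations N_low..N_high
              display "`v': trimmed range      = [" `v'[`N_low'] ", " `v'[`N_high'] "]"

              // cross-check against the 2.5th and 97.5th percentiles
              _pctile `v', percentiles(2.5 97.5)
              display "`v': _pctile 2.5, 97.5 = [" r(r1) ", " r(r2) "]"
          }

          // put observations back in their original order
          sort seq
          ```

          This assumes the variables contain no missing values; sort places missing values last, so they would otherwise be counted among the largest draws.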
          Last edited by William Lisowski; 16 Feb 2020, 11:24.
