  • Order variables from left to right based on their latest value

    I would like to order the variables (in the sense of the Stata command "order") from left to right based on their latest observation. Here, the only observation of each variable that matters is the one with w_date = "2022w46". Here is the dataset:
    Code:
    clear all
    input str7 w_date cpd cpdpf cpdpm
    "2022w44" -.3522595 -.15837106 .45831277
    "2022w45" -.05552628 .00728419 .63357966
    "2022w46" -.04414876 .08082671 .65427268
    end

    In other words, I would like to automate the following ordering of the variables:
    Code:
    order w_date cpdpm cpdpf cpd
    To give a bit of context, I have 100+ variables that are cumulative returns. What matters in my context is the highest return generated at the latest date of the sample, in this example 2022w46. I would like to see the variables that generate the highest returns ordered from left to right on the screen, so that I can then select among them.

    Can you help me automate that?

  • #2
    Code:
    clear all
    input str7 w_date cpd cpdpf cpdpm
    "2022w44" -.3522595 -.15837106 .45831277
    "2022w45" -.05552628 .00728419 .63357966
    "2022w46" -.04414876 .08082671 .65427268
    end
    
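    * Work on a copy of the data in a separate frame
    frame put w_date cpd*, into(find_order)
    frame find_order {
        * Sort by date so that the last observation is the latest one
        isid w_date, sort
        * Keep only the last nonmissing value of each variable
        collapse (lastnm) cpd*
        * Prefix each name with x so that -reshape long- has a common stub
        rename cpd* xcpd*
        gen `c(obs_t)' obs_no = _n
        reshape long x, i(obs_no) j(cpd_var) string
        * Sort the variable names from highest to lowest latest value
        gsort -x
        local wanted_order
        forvalues i = 1/`=_N' {
            local wanted_order `wanted_order' `=cpd_var[`i']'
        }
    }
    
    order w_date `wanted_order'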
    Added: Out of laziness, I have relied on the fact that an alphabetic sort of the string variable w_date, in the format shown in your example, produces the correct chronological order. But really, it is always hazardous to use string variables to represent dates that you are going to actually perform operations with. It is better practice to first use the -weekly()- function to convert w_date into a Stata internal format weekly date variable, and use that converted variable instead of w_date in the above code.
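    A minimal sketch of that conversion, assuming w_date is a string in the form shown in the example ("2022w46"). The new variable name wdate is hypothetical; the "YW" mask tells -weekly()- that the year comes first, then the week.
    Code:
    * Convert the string date to a Stata weekly date (wdate is a hypothetical name)
    gen int wdate = weekly(w_date, "YW")
    format wdate %tw
    * Check that every string was converted
    assert !missing(wdate)
    * wdate can now replace w_date in -frame put-, -isid- and -order-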
    Last edited by Clyde Schechter; 23 May 2023, 18:48.



    • #3
      Thanks for your response, Clyde. I actually have a long list of variables that are not named cpd*; their names are stored in a global macro, $strat (I do not wish to change it to local, btw). I am having a hard time dealing with "rename". Also, in my real code, I have an actual Stata weekly date. Would you be able to help me with this? Thanks



      • #4
        I actually have a long list of variables that are not named cpd*; their names are stored in a global macro, $strat (I do not wish to change it to local, btw). I am having a hard time dealing with "rename".
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input float(w_date cpd cpdpf cpdpm)
        3267  -.3522595 -.15837106 .4583128
        3268 -.05552628  .00728419 .6335797
        3269 -.04414876   .0808267 .6542727
        end
        format %tw w_date
        
        global strat cpd cpdpf cpdpm
        
        frame put w_date $strat, into(find_order)
        frame find_order {
            isid w_date, sort
            collapse (lastnm) $strat
            rename ($strat) x=
            gen `c(obs_t)' obs_no = _n
            reshape long x, i(obs_no) j(cpd_var) string
            gsort -x
            local wanted_order
            forvalues i = 1/`=_N' {
                local wanted_order `wanted_order' `=cpd_var[`i']'
            }
        }
        
        order w_date `wanted_order'
        I do not wish to change it to local, btw
        I imagine that you included this remark because you know that I frequently point out that global macros are dangerous and should be avoided unless absolutely necessary. Moreover, as I have been working in Stata since 1994 and only once come across a situation where a local macro could not be used and there was no other mechanism to get around it, forcing me to use a global macro, I think such situations are very, very rare and I'd be inclined to bet that you do not have a good reason for your wish. Nevertheless, you are, of course, free to do as you please, at your own risk.

        Also, in my real code, I have an actual Stata weekly date.
        You have over 700 Statalist posts. Even assuming you read nothing but your own threads here, that is a lot of Statalist experience, and I would have thought that you would know from experience that posting a data example that does not represent your actual data set can result in getting responses from others that don't work when you try to run them. In this case, it didn't matter. But in the future, it is in your best interests not to do things like that. Post an example that is actually like your data set--you will save yourself and others time and trouble by doing so.





        • #5
          Thanks a lot Clyde, it really helps.

          As an extension of the previous code, I would like to drop variables for which the last value is below 0. I have tried the following but I am getting a syntax error. Can you help?

          Code:
          foreach var in `wanted_order' {
              drop `var' if (`var'[_N]<0)
          }



          • #6
            #5 is illuminated by reading the help for drop.

            There are two quite distinct ways to use drop -- to drop variables or to drop observations. For the latter an if qualifier is essential (if no in qualifier is given) but for the former it makes no sense and is not allowed.

            What is allowed is an if command:

            Code:
             
             if (`var'[_N]<0) drop `var'
            Although many Stata users have guessed wrongly that the two ways to use if are equivalent, they really aren't. The if command sets in train one and only one test, as in the example -- which if true causes the following code to be executed. In contrast the if qualifier follows the main command and sets in train a separate decision for each observation.
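
            Putting this into the loop from #5 might look like the sketch below. It assumes that the data are sorted by w_date, so that observation _N holds the latest date, and that the local macro wanted_order from #4 is still defined in the same do-file run. A missing last value counts as larger than any number in Stata, so such a variable would be kept.

            Code:
            sort w_date
            foreach var of local wanted_order {
                * if command: one test per variable, on the last observation only
                if `var'[_N] < 0 {
                    drop `var'
                }
            }
            * By contrast, the if qualifier is evaluated once per observation:
            * -drop if cpd < 0- would drop every observation (row) in which cpd is negative
            Note that wanted_order still lists the dropped names afterwards, so it would need rebuilding before being reused.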

            Clyde Schechter and I (small world, this) have in press for Stata Journal 23(2) a longer piece expanding on the distinction, but that won't appear for some weeks.

            Talking of weeks: the use of string weekly dates here makes me wonder whether you are somehow using Stata weekly dates. On a par with Clyde's stern but principled admonition to avoid globals like the plague is my insistence -- in many posts here and in pieces in the Stata Journal -- that Stata's weeks are unlikely to be what anyone else wants. Stata's weeks come with the rules:

            1. Week 1 of any year starts on 1 January, always.

            2. There are 52 weeks in any year, always.

            3. Hence the last week of the year exceptionally includes 8 days in a non-leap year and 9 days in a leap year.

            I've never come across any data provider using those rules. Published weekly data are often based on the idea that a week starts on a certain day of the week or at least that data are associated with a particular day of the week. On such definitions, there can be 53 weeks in some years, and weeks may span two years, neither of which is allowed with Stata weekly dates.
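
            The rules above can be checked directly with Stata's date functions. The sketch below uses 2022 (not a leap year) purely as an illustration and is meant to be run from a do-file; the expected results in the comments follow from the three rules.

            Code:
            * Rule 1: 1 January always falls in week 1
            display week(mdy(1, 1, 2022))                  // 1
            * Rule 2: 31 December always falls in week 52
            display week(mdy(12, 31, 2022))                // 52
            * Rule 3: week 52 of 2022 runs from 24 December to 31 December
            display %td dofw(tw(2022w52))                  // 24dec2022
            display dofw(tw(2023w1)) - dofw(tw(2022w52))   // 8 days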
