  • Creating standardized variables per month for multiple variables

    Good afternoon,

    I have a question concerning standardization per month for multiple variables. I am using panel data for a financial asset pricing model, and I would like to create standardized values for a number of variables (ME, BTM, etc.) per month. That is, I would like to standardize ME, BTM, etc. separately within January 1963, February 1963, and so on, all the way up to December 2013. The time variable is simply called 'time' and has the following format: YYYYmM (e.g. 1962m7).

    I have tried to solve this with the egen command -by time: egen std_ME = std(log_ME)-, but this gives the following error: "egen ... std() may not be combined with by". Furthermore, even if it did work, I would have to repeat the process manually for every variable (not a big issue, though, since only a few variables need to be standardized).

    I have also found the following script on Statalist:

    Code:
    levels class, local(levels)
    gen std = .
    foreach l of local levels {
    qui sum mpg if class == `l'
    qui replace std = r(sd) if class == `l'
    }

    I have tried, unsuccessfully, to adapt it so that it creates standardized values, rather than standard deviations, for my own data:

    Code:
    levels time, local(levels)
    gen std_ME = .
    foreach l of local levels {
    qui sum log_ME if time == `l'
    qui replace std = r(std) if time == `l'
    }


    I should stress that I have only very basic knowledge of Stata and little or no programming experience, so my question might be vague or unclear. If needed, I am more than happy to supply additional information.

    Thanks in advance, Martin
    Last edited by Martin Pott; 16 Jun 2014, 06:06.

  • #2
    summarize does not leave behind r(std); it leaves behind r(mean) and r(sd), among other results. That is reason enough for your code not to work.
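
    As a minimal sketch of what the loop approach could look like once it uses results that summarize does return, the following standardizes mpg within each level of foreign in the auto dataset shipped with Stata. The dataset and the variable names mpg, foreign and std_mpg are stand-ins chosen for illustration, not the poster's data; for the panel in question one would presumably substitute log_ME and time, assuming time is a numeric monthly date.

    ```stata
    * Illustration on the shipped auto data (an assumption, not the poster's data)
    sysuse auto, clear

    * Collect the distinct group values into a local macro
    levelsof foreign, local(levels)

    * Start with an empty variable and fill it group by group
    gen std_mpg = .
    foreach l of local levels {
        * summarize leaves the group mean in r(mean) and the group SD in r(sd)
        quietly summarize mpg if foreign == `l'
        quietly replace std_mpg = (mpg - r(mean)) / r(sd) if foreign == `l'
    }
    ```

    Typing -return list- directly after summarize shows everything it stores, which is a quick way to check a result name before using it.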

    It seems anomalous that std() in egen does not support by: but it is easy to work round that:


    Code:
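    foreach v in ME BTM {
         * mean and SD of each variable within each month
         bysort time : egen mean = mean(`v')
         by time: egen sd = sd(`v')
         * standardized value: deviation from the monthly mean in monthly SD units
         gen std_`v' = (`v' - mean) / sd
         * drop the helpers so the names can be reused on the next pass
         drop mean sd
    }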
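
    One way to check the result of a loop like this is to confirm that each standardized variable has mean 0 and standard deviation 1 within every group. The sketch below does that on the auto dataset shipped with Stata; the grouping variable foreign and the variables mpg and weight are stand-ins for time, ME and BTM, chosen only so that the example runs on its own.

    ```stata
    * Illustration on the shipped auto data (an assumption, not the poster's data)
    sysuse auto, clear

    * Same pattern as above, with foreign in place of time
    foreach v in mpg weight {
        bysort foreign: egen mean = mean(`v')
        by foreign: egen sd = sd(`v')
        gen std_`v' = (`v' - mean) / sd
        drop mean sd
    }

    * Within each group the means should be about 0 and the SDs about 1
    tabstat std_mpg std_weight, by(foreign) statistics(mean sd)
    ```

    Small departures from exactly 0 and 1 are expected because the new variables are stored as floats by default.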



    • #3
      It works like a charm. Thank you very much!

      Kind regards,


      Martin



      • #4
        Hello,
        I have a question concerning standardizing variables in time-series data. I am working with a data set of car purchases for 2002-2012 that includes prices, quality, advertising spending and other variables per model, and I am trying to predict sales per model. My question is whether it makes sense in my case to z-standardize the variables per period and/or per car segment (e.g., sports cars). See also http://www.stata.com/statalist/archi.../msg00006.html Thank you



        • #5
          I tried the code in this post, but it killed my t-statistics, which were very large before. My code was:

          Code:
          foreach i in ELAProfrate Mathprofrate Mathavg Englishavg {
               bysort Year: egen mean = mean(`i')
               by Year: egen sd = sd(`i')
               gen std_`i' = (`i' - mean) / sd
               drop mean sd
          }
          Any hints on what I got wrong?
