
  • Z-scores over a sum of partly missing variables

    Dear Experts,

    I am trying to calculate z-scores over a sum of variables sorted by month.

    Code:
    bysort month_id: egen z_profitability=std(var1+var2+var3+var4+var5)
    If even one of the variables is missing, the sum, and therefore the z-score for that row, is missing, so I am trying to include only the nonmissing variables in the std() function.

    I managed to create a string which contains the nonmissing variables per row with the following code but I am unable to "insert" this string into the std-function.

    Code:
        gen nonmissing_vars = ""
        foreach var of local vars {
        replace nonmissing_vars = nonmissing_vars + "`var'+" if !missing(`var')
        }
        replace nonmissing_vars = substr(nonmissing_vars, 1, length(nonmissing_vars) - 1)
    
        gen z_profitability = .
        forval i = 1/`=_N' {
        local curr_var_list = nonmissing_vars[`i']
        egen z_profitability[`i'] = std(inlist("`curr_var_list'"), missing)
        }
    Does anyone know how to "insert" this string as the sum of variables into the std() function? Other approaches to my problem are of course also very welcome.
    Thanks in advance!


  • #2
    The longer block of code is fairly confused, unfortunately. As it proceeds, it seems to be based more and more on ideas from quite different languages. You can't use subscripts like that in an egen statement, and the text of the argument to std() can't vary from observation to observation anyway.

    If we back up to

    Code:
    bysort month_id: egen z_profitability=std(var1+var2+var3+var4+var5)
    then what you want might be more like
    Code:
    egen varsum = rowtotal(var1 var2 var3 var4 var5)
    bysort month_id: egen z_profitability = std(varsum)
    noting that varsum carries no information on the number of values it is based on. Whether this is a good way to work with missing values is naturally a key issue.
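
    A minimal sketch of one way to keep track of how many values each total is based on. The toy data and the names n_nonmiss, month_mean and month_sd are assumptions for illustration only. rowtotal() treats missing values as 0, so a row with all five variables missing would get a total of 0; the missing option returns missing in that case instead. The z-score here is built from egen's mean() and sd() within month purely to make the steps explicit; it is intended to give the same result as std() applied within each month.

```stata
* Toy data, made up for illustration; variable names follow the thread
clear
input month_id var1 var2 var3 var4 var5
1 1 2 3 4 5
1 2 . 4 6 8
1 . . . . .
2 3 3 3 3 3
2 5 . . 1 0
2 1 1 . 2 2
end

* How many of the five components are nonmissing in each row
egen n_nonmiss = rownonmiss(var1 var2 var3 var4 var5)

* Row total; with the missing option an all-missing row yields missing, not 0
egen varsum = rowtotal(var1 var2 var3 var4 var5), missing

* Standardize within month: (value - month mean) / month SD
bysort month_id: egen month_mean = mean(varsum)
by month_id: egen month_sd = sd(varsum)
gen z_profitability = (varsum - month_mean) / month_sd

list month_id n_nonmiss varsum z_profitability, sepby(month_id)
```

    With n_nonmiss in hand, rows based on only a few values can be inspected or set aside before standardizing. Whether and where to draw that line is a substantive decision, not something the code can settle.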



    • #3
      Sorry for the flaws in my code; it is my second week with Stata.
      Your approach using rowtotal() works well. Now it is my turn to figure out whether this makes statistical sense for my analysis.

      Thank you very much!
