  • Pearson's correlation with multiple imputations

    Dear Statalisters,

    I am trying to calculate Pearson's correlation for two variables in a dataset with M=10 imputations.
    Since the pwcorr command is not supported by the mi command in Stata, I searched for an alternative and found out that I can combine the correlation coefficients using Fisher's z transformation.
    Following the example for combining R^2 (http://www.stata.com/support/faqs/st...-imputed-data/), I tried the following commands:

    mi query
    local M=10
    scalar corr=0
    mi xeq 1/`M' : pwcorr v1 v2 scalar corr = corr + atanh(sqrt(e(corr)))
    scalar corr = tanh(corr/`M')^2
    di as txt "Correlation using Fisher's z over imputed data = " as res corr

    However, even though Stata calculates the individual correlations for each m, it does not show anything for the final result. All I get is

    . scalar corr = tanh(corr/`M')^2
    . di as txt "Correlation using Fisher's z over imputed data = " as res corr
    Correlation using Fisher's z over imputed data = .

    I am not an expert at doing this kind of calculation in Stata, and I think that maybe the formula from the R^2 example cannot be used for correlations, but I am lost on how to adapt it to get it to work.
    Any help would be greatly appreciated,
    Maleika

  • #2
    There are many things here that need to be changed.

    First, pwcorr returns results in r(), not e(), so there is no e(corr). There is also no r(corr). You probably want r(rho) here. Please read the help file for pwcorr, where all this is documented.

    Second, not that it matters much, but I do not get the point of using pwcorr in the first place. With only two variables, correlate will do. Even if you had more than two variables (which, by the way, would require combining results in the matrix r(C)), there should not be missing values in imputed data. So, again, correlate would seem the more natural choice. If there were missing values in your imputed variables, then I would seriously think about what it means to combine multiple imputation with a pairwise approach to missing data.

    Third, why do you want the square root of the correlations before transforming them? These are already the rs, not R-squared.

    Last, concerning your code, I do not believe that this is actually what you typed, since the line

    Code:
    mi xeq 1/`M' : pwcorr v1 v2 scalar corr = corr + atanh(sqrt(e(corr)))
    does not contain a semicolon, so Stata should have issued an error here. Note that showing exactly what you typed is often critical for helping others identify problems and help you.

    With this said, try

    Code:
    mi query
    local M=10
    scalar corr=0
    mi xeq 1/`M' : correlate v1 v2 ; scalar corr = corr + atanh(r(rho))
    scalar corr = tanh(corr/`M')
    di as txt "Correlation using Fisher's z over imputed data = " as res corr
    Best
    Daniel
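
    For reference, the pooling that the corrected code performs can be written out as follows. The symbols r_m (the correlation in imputation m), z_m, and the bars are notation introduced here for illustration; they do not appear in the thread.

    ```latex
    z_m = \operatorname{atanh}(r_m) = \tfrac{1}{2}\ln\frac{1 + r_m}{1 - r_m},
    \qquad
    \bar{z} = \frac{1}{M}\sum_{m=1}^{M} z_m,
    \qquad
    \bar{r} = \tanh(\bar{z})
    ```

    In the code, the scalar corr accumulates the sum of the z_m inside mi xeq, and the line after it divides by M and back-transforms with tanh(). No square root or squaring is involved, because r(rho) is already a correlation rather than an R-squared.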


    • #3
      Dear Daniel,

      thank you so much for your help. It worked great!

      I am sorry about the semicolon. I changed the names of the two variables to v1 and v2 here, because the original names are quite long. I must have erased the semicolon by accident.

      I have one last (maybe stupid) question: Is there a way to obtain a significance level for the combined correlation parameters or at least a confidence interval?

      Best
      Maleika


      • #4
        For only two variables, you could combine the square root of R-squared obtained from regress, then compute a confidence interval as suggested here.

        Best
        Daniel
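
        As a rough illustration of what an interval on the Fisher z scale could look like, here is a minimal sketch. It is not necessarily the method behind the link mentioned above. It assumes the variables v1 and v2 and M = 10 from the thread, that the number of observations is the same in every imputation, and that the usual large-sample variance 1/(N - 3) for Fisher's z combined with Rubin's rules is acceptable. As a simplification it uses a normal reference distribution rather than Rubin's t degrees of freedom. All scalar names are made up for this example.

        ```stata
        * Sketch only: pool Fisher-z values and form an approximate 95% CI.
        local M = 10
        scalar z_sum  = 0
        scalar z_sum2 = 0

        * Accumulate the sum and sum of squares of z across imputations.
        mi xeq 1/`M' : correlate v1 v2 ; scalar n_obs = r(N) ; scalar z_sum = z_sum + atanh(r(rho)) ; scalar z_sum2 = z_sum2 + atanh(r(rho))^2

        * Pooled z, within-, between-, and total variance (Rubin's rules).
        scalar z_bar   = z_sum/`M'
        scalar var_w   = 1/(n_obs - 3)
        scalar var_b   = (z_sum2 - `M'*z_bar^2)/(`M' - 1)
        scalar var_tot = var_w + (1 + 1/`M')*var_b

        * Back-transform the interval limits to the correlation scale.
        scalar ci_lo = tanh(z_bar - invnormal(0.975)*sqrt(var_tot))
        scalar ci_hi = tanh(z_bar + invnormal(0.975)*sqrt(var_tot))
        scalar p_val = 2*normal(-abs(z_bar/sqrt(var_tot)))

        di as txt "Pooled r = " as res tanh(z_bar)
        di as txt "Approx. 95% CI = [" as res ci_lo as txt ", " as res ci_hi as txt "]"
        di as txt "Approx. two-sided p = " as res p_val
        ```

        Because the interval is built on the z scale and then back-transformed, it stays inside (-1, 1) and is generally not symmetric around the pooled r.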


        • #5
          Thank you again Daniel.

          Right now, I am only looking at two variables, so I will do as you suggested for now.

          However, if I were to take a third variable into account as well, would it be acceptable to run a regression for each pair individually and combine the square roots, or is there another option for more than two variables?
