  • Meta-analysis of correlation coefficients in Stata

    Dear Statalist

    I'm currently working on a meta-analysis of correlation coefficients and am looking at the commands available in Stata. Imagine I have data like this:

    Code:
    input str10 study year r n
    Natak    1992 .40  50
    Bundhi   1998 .50 100
    Rashnam  2001 .40  18
    Chetram  2002 .20 730
    Sankaram 2008 .70  44
    Chetty   2016 .45  28
    end
    I would then apply Fisher's z transformation to r and calculate its standard error like this:

    Code:
    generate z = .5 * ln((1 + r) / (1 - r))
    generate sez = sqrt(1/(n - 3))

    And then I can use -metan-:

    Code:
    metan z sez, label(namevar = study, yearvar = year)

    Which would give me results like this:

    [Image: Graph.png (forest plot from -metan-)]

    However, the convention in meta-analyses seems to be to transform the Fisher's z effect sizes back into correlations for presentation purposes.

    My questions are:
    • Is there any option in -metan- that would do that for me?
    • Is there perhaps any other Stata command for meta-analysis that would do that for me? So far I have the impression that none of the available commands does.
    Thanks for your consideration
    Go

  • #2
    I would recommend using the Fisher's z transform or converting the correlations to Cohen's d effect sizes.

    However, if you want to present the results as correlations in the forest plot, the standard error of r can be calculated as:

    SE of r = sqrt((1 - r^2)/(n - 2))
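
    A minimal sketch of that calculation in Stata, assuming variables r and n as in the example data above:

    Code:
    * SE of r = sqrt((1 - r^2)/(n - 2))
    generate SEr = sqrt((1 - r^2)/(n - 2))

    * quick check with the first study above (r = .40, n = 50)
    display sqrt((1 - .40^2)/(50 - 2))   // about .1323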

    Red Owl
    Stata/IC 15.1 (Windows 10, 64-bit)

    Comment


    • #3
      Thanks for your response, Red Owl. Does that mean you'd suggest not using any of the available Stata commands for meta-analysis of correlation coefficients?

      Cheers
      Go

      Comment


      • #4
        No, I was just offering the formula for estimating a standard error of a Pearson's correlation coefficient to use if you wanted to use r as your effect size in the forest plot.

        I personally would convert the r to d (Cohen's d) and use that in the forest plot, but you may have good reasons to use r instead. I wasn't offering an opinion about what is best, just what is possible.

        Red Owl
        Stata/IC 15.1, Windows 10 (64-bit)

        Comment


        • #5
          Hi Gobinda. Note that the r-to-Z transformation is really the inverse hyperbolic tangent, so if you meta-analyze the Zr values, the hyperbolic tangent can be used to transform the pooled estimate back to the original scale. The Stata functions are atanh() and tanh().

          Here's an example using your data.

          Code:
          clear *
          input str10 study year r n
          Natak    1992 .40  50
          Bundhi   1998 .50 100
          Rashnam  2001 .40  18
          Chetram  2002 .20 730
          Sankaram 2008 .70  44
          Chetty   2016 .45  28
          end
          generate z = atanh(r) // r-to-z = inverse hyperbolic tangent
          generate sez = sqrt(1/(n - 3))
          metan z sez, label(namevar = study, yearvar = year)
          display _newline ///
          " Pooled estimate of r = " tanh(r(ES)) _newline ///
          "Lower Limit of 95% CI = " tanh(r(ci_low)) _newline ///
          "Upper Limit of 95% CI = " tanh(r(ci_upp))
          Output from -metan- and -display-:

          Code:
          . metan z sez, label(namevar = study, yearvar = year)
          
                     Study     |     ES    [95% Conf. Interval]     % Weight
          ---------------------+---------------------------------------------------
          Natak (1992)         |  0.424       0.138     0.710          4.94
          Bundhi (1998)        |  0.549       0.350     0.748         10.19
          Rashnam (2001)       |  0.424      -0.082     0.930          1.58
          Chetram (2002)       |  0.203       0.130     0.275         76.37
          Sankaram (2008)      |  0.867       0.561     1.173          4.31
          Chetty (2016)        |  0.485       0.093     0.877          2.63
          ---------------------+---------------------------------------------------
          I-V pooled ES        |  0.288       0.225     0.352        100.00
          ---------------------+---------------------------------------------------
          
            Heterogeneity chi-squared =  27.78 (d.f. = 5) p = 0.000
            I-squared (variation in ES attributable to heterogeneity) =  82.0%
          
            Test of ES=0 : z=   8.90 p = 0.000
          
          . display _newline ///
          > " Pooled estimate of r = " tanh(r(ES)) _newline ///
          > "Lower Limit of 95% CI = " tanh(r(ci_low)) _newline ///
          > "Upper Limit of 95% CI = " tanh(r(ci_upp))
          
           Pooled estimate of r = .28071524
          Lower Limit of 95% CI = .22121715
          Upper Limit of 95% CI = .33813133
          HTH.
          --
          Bruce Weaver
          Email: [email protected]
          Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
          Version: Stata/MP 18.0 (Windows)

          Comment


          • #6
            Thanks both for helping me out here. What I learned from your comments and a bit more research is that I wanted to ultimately end up with this here, for which I needed the commands -admetan- and -forestplot-:

            Code:
            clear
            
            input str10 study year r n
            Natak    1992 .40  50
            Bundhi   1998 .50 100
            Rashnam  2001 .40  18
            Chetram  2002 .20 730
            Sankaram 2008 .70  44
            Chetty   2016 .45  28
            end
            
            generate z = atanh(r) // r-to-z = inverse hyperbolic tangent
            generate sez = sqrt(1/(n - 3))
            
            admetan z sez
            
            display _newline ///
            " Pooled estimate of r = " tanh(r(eff)) _newline ///
            "Lower Limit of 95% CI = " tanh(r(eff) - (1.96 * r(se_eff))) _newline ///
            "Upper Limit of 95% CI = " tanh(r(eff) + (1.96 * r(se_eff)))
            
            // Prepare data for -forestplot-
            generate _USE = 1
            
              // Generate CI's for r
            generate lb = tanh(_LCI)
            generate ub = tanh(_UCI)
            
              // Generate study labels for -forestplot-
            generate _LABELS = study + " (" + string(year, "%02.0f") + ")"
            
            label var n "Sample size"
            
              // Add effect size to data set
            local new = _N + 1
            set obs `new'
            replace _LABELS = "{bf:Overall}"                    if _n == _N
            replace r       = tanh(r(eff))                      if _LABELS == "{bf:Overall}"
            replace lb      = tanh(r(eff) - (1.96 * r(se_eff))) if _LABELS == "{bf:Overall}"
            replace ub      = tanh(r(eff) + (1.96 * r(se_eff))) if _LABELS == "{bf:Overall}"
            replace _USE    = 5                                 if _LABELS == "{bf:Overall}"
            
              // Forest plot
            forestplot r lb ub, nonull effect("Correlation") rcol(n) leftjustify nowt
            So that I could end up with this plot:

    [Image: Graph.png (forest plot of correlations produced by -forestplot-)]


            This looks roughly like what I was after (for future reference: this post https://www.statalist.org/forums/for...34#post1375334 seems to contain some deep knowledge on how to format -forestplot-).

            Thanks again
            Go

            Comment


            • #7
              Gobinda Natak and Bruce Weaver

              I tested my approach of calculating the SE of r directly, to compare the results to those produced by the r-to-z transform approach suggested by Bruce Weaver.

              I got similar (but not identical) results with fixed effects meta-analysis. I noticed, however, that the heterogeneity is relatively high (I-squared = 81.8%), suggesting that a random effects approach is probably warranted.

              With the fixed effects approach, my overall mean r effect size is .296 [.236,.356], and the one produced with the r-to-z transform method is .288 [.225,.352].

              With the random effects approach my overall mean r effect size rises to .435 [.240,.631].
              Code:
              clear
              
              input str10 study year r n
              Natak    1992 .40  50
              Bundhi   1998 .50 100
              Rashnam  2001 .40  18
              Chetram  2002 .20 730
              Sankaram 2008 .70  44
              Chetty   2016 .45  28
              end
              
              * Estimate SE of r and format r and SEr
              gen SEr = sqrt((1 - r^2)/(n - 2))
              format r SEr %4.3f
              
              * Generate Study var and labels
              generate Study = study + " (" + string(year, "%02.0f") + ")"
              label var n "Sample size"
              
              * Fixed effects meta-analysis assuming homogeneity of effects
              metan r SEr, lcols(Study) rcols(SEr n) astext(85) xlabels(0(.25)1) name(forestfixd, replace)
              
              * Random effects meta-analysis assuming heterogeneity of effects
              metan r SEr, random lcols(Study) rcols(SEr n) astext(85) xlabels(0(.25)1) name(forestrand, replace)
              
              * Combined fixed and random effects forest plots
              graph combine forestfixd forestrand, ysize(3) xsize(6) name(ROcombined, replace)
              [Image: forestplots.png (combined fixed and random effects forest plots)]



              Red Owl
              Stata/IC 15.1, Windows 10 (64-bit)

              Comment


              • #8
                I'm trying to use -metan- to run a meta-analysis of Pearson correlations with inverse variance weighting, in the presence of high sample heterogeneity. I supply Fisher's z and its SE to -metan-. How do I get -metan- to recognize n as the sample size for the weighting? If I use randomi or fixedi, it doesn't seem to recognize the sample sizes.

                Comment


                • #9
                  Messmer George, I'm confused. How can you use inverse variance weighting and weighting by sample size at the same time? You have to pick one or the other, I think. Please clarify.
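
                    One way to see the overlap for Fisher's z specifically: since var(z) = 1/(n - 3), the inverse-variance weight is just n - 3, so inverse variance weighting already reflects sample size. A minimal sketch using the example data from earlier in the thread:

                    Code:
                    generate sez = sqrt(1/(n - 3))
                    generate ivweight = 1/sez^2
                    list study n ivweight   // ivweight equals n - 3 for every study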
                  --
                  Bruce Weaver
                  Email: [email protected]
                  Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
                  Version: Stata/MP 18.0 (Windows)

                  Comment


                  • #10
                    Hi All,

                    I am doing a meta-analysis of the relationship between negative symptoms and functioning in youth with psychosis. My initial instinct was to take the Pearson r reported in the studies, convert it to the Fisher's z scale, perform random effects meta-analyses on the transformed values as suggested in this forum, and then transform back, and your advice worked great.

                    But now I am second-guessing myself: since the negative symptom scales are different (SANS, SIPS, etc.) and the functioning scales are different (GAF, SOFAS, etc.), would you instead opt to transform Pearson r to SMD? Any suggestions/thoughts? And if you recommend this, how would one go about doing it? Or does the difference in scales not matter when using z transformations? Thanks so much, Dan

                    Comment


                    • #11
                      Originally posted by Daniel Devoe View Post
                      But now I am second guessing myself and I am wondering since the negative scales are different (SANS, SIPS, etc) and the functioning scales are different (GAF, SOFAS, etc) would you instead opt to transform Pearson r to SMD? Any suggestions/ thoughts? And if you recommend this how would one go about doing this? Or does the difference in scales not matter when using z transformations? Thanks so much, Dan
                      Hi Daniel. I can't speak for others, but I don't know what you mean when you say "the negative scales are different (SANS, SIPS, etc) and the functioning scales are different (GAF, SOFAS, etc)". Different from what? Please provide more information about the scales and the correlations you are trying to pool. And what is SMD? Standardized mean difference, perhaps?

                      Thanks for clarifying.

                      Bruce
                      --
                      Bruce Weaver
                      Email: [email protected]
                      Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
                      Version: Stata/MP 18.0 (Windows)

                      Comment


                      • #12
                        Hi Bruce, different from each other, in that they measure very similar concepts (either negative symptoms or functioning) but do so on different scales. The correlations I am trying to pool are the coefficients r between functioning and negative symptoms reported across studies, regardless of the scales used.

                        For example, I have studies reporting this relationship between negative symptoms and functioning using a variety of scales: "coefficient r between functioning scale a and negative symptom scale b", "coefficient r between functioning scale b and negative symptom scale c", "coefficient r between functioning scale z and negative symptom scale e", etc.

                        For SMD, in this case I was referring to Cohen's d.
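
                        For what it's worth, the conversion I have seen for going from r to Cohen's d is d = 2r/sqrt(1 - r^2). A minimal Stata sketch, assuming a variable r as in the example data earlier in the thread:

                        Code:
                        * convert Pearson r to Cohen's d
                        generate d = 2*r / sqrt(1 - r^2)

                        * quick check: r = .5 gives d = 1/sqrt(.75), about 1.1547
                        display 2*.5 / sqrt(1 - .5^2)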

                        Thanks so much, Dan


                        Dan Devoe (BA, MSc, PhD Candidate)
                        Dept. of Psychiatry | Cumming School of Medicine | University of Calgary
                        TRW Building | Mathison Centre for Mental Health Research & Education
                        email:[email protected]

                        Comment
