
  • Computing confidence intervals for meta-analysis

    Hi there,

    I am currently attempting to run a meta-analysis on the effects of debt on economic growth. I have compiled t-statistics from each of the papers in my analysis to use as my effect sizes, but do not have a way of finding their standard errors. I read in the meta-analysis manual that I can use confidence intervals instead of using the standard error, in the form:

    "instead of the standard errors, specify the confidence intervals, and meta set will compute the corresponding standard errors based on them: . meta set es cil ciu".
    [Attachment: data screenshot.PNG]
    I am struggling to figure out how to generate confidence interval variables that work as 'cil' and 'ciu'. I have attached a screenshot of my data above, in case that helps.

    Would anyone be able to help me with this?

    Thanks,

    John


  • #2
    And despite my eyesight being pretty good, I'm struggling to see what's in your screenshot. This is why the FAQ asks for no screenshots.

    That said, you did say you compiled t-statistics. Those are basically the effect size of the parameter of interest divided by its standard error. I don't see the effect size in your variable list, which you need. I can't quite tell how you define "debt" in this case, but in my field, we might have a bunch of studies (I'm making this up) on the effect of an antidepressant on depressive symptoms. We might convert those to a standardized scale, e.g. study 1 showed a benefit of 0.1 standard deviations over placebo, study 2 showed a benefit of 0.3 SDs over placebo, study 3 showed a benefit of -0.1 SDs, etc. You need the equivalent to that. It should be the numerator term in the t-statistic. I don't imagine that many studies report t-statistics to the exclusion of all other information.

    For example, I think that this paper is one of the ones in your list - I could make out the name and the year, at least. I can't find the number -.222 anywhere in the paper, but I can see that there are regression tables which present the betas with the SEs below them in brackets. The betas should be the effect sizes, and the table notes clearly state that the numbers in brackets are the standard errors. I glanced at the paper and I don't see t-statistics anywhere. Are you sure you were entering t-statistics in your column? All papers should report the betas, which is one piece of information you need. I have to imagine that they'll report either a standard error or a confidence interval; either can serve as your measure of uncertainty. If a paper somehow reports a beta and a t-statistic but no other information (I've never seen this in my field, at least), then you can reverse engineer the standard error with algebra, as sketched below.
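    To make the algebra concrete, here is a minimal sketch (beta and tstat are hypothetical variable names standing in for what you would extract from each paper):

    Code:
    * since t = beta/se, the standard error can be recovered as se = beta/t
    generate double se = beta / tstat
    * sanity check: a consistent beta and t share a sign, so se should be positive
    assert se > 0 if !missing(se)
    * with the effect size and SE in hand, declare the data directly:
    meta set beta se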
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #3
      John:
      as an aside to Weiwen's helpful advice, have you already taken a look at the community-contributed module -metan-?
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Hi Weiwen and Carlo,

        Thank you both very much for your input, it is greatly appreciated!

        Sorry about the screenshot, I haven't posted on here before and was unaware of the FAQ rules. In terms of how I am defining debt, I am using a broad definition, essentially referring to all claims held on the government by domestic and foreign agents. The t-statistics I have compiled in my column come from this paper, which ran a meta-analysis on the same topic and whose analysis I am trying to extend. That paper uses the t-statistic as a dependent variable because it allows comparison across the set of papers. It does not specify how the t-statistics were compiled, which isn't particularly helpful, especially as some of the papers (including the one that you linked above) have multiple regression specifications and samples. As such, I am not sure what the values of the effect sizes and standard errors are. My aim is to add papers to the ones they used and see whether or not that affects their outcome. However, as they are not clear about how they calculated their t-statistic values, I am struggling to do the same with the papers that I have found.

        However, your point that the t-statistic is essentially the effect size divided by the standard error is very helpful. I am assuming that the authors used this to calculate their t-statistic values, and that must be how they got the -0.222 value from the Hepp (2005) paper that you linked. I will use this to try to calculate the t-statistics for the papers that I have found. As an example, this is one of the papers that I am using, which also contains multiple regression specifications and samples. I am struggling to understand how to calculate a t-statistic for the paper overall when there are different regressions. Do I have to calculate a t-statistic for each one and then take an average? When declaring my es and se variables in Stata, I previously thought that I had to use the t-stat value as my es value, but based on what you are saying, I now think I have to derive the effect size from the t-stat value as well. Is this correct?

        This is my data sample using the command you specified:

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str33 AuthorsOfStudy str3 Result float(Year tstat) str46 Method str14 Model str3(NonLinearSpecification Panel) float obs str3(HIPIC ExternalDebt Published) float Pos
        "Patillo et al."            "+"   2004   .84 "OLS, IV, SGMM, FE, LSDV" "Fixed"          "Yes" "Yes" 455 "1" "1" "1" 1
        "Hepp"                      "-"   2005 -.222 ""                        ""               "No"  "yes" 156 "1" "0" "0" 0
        "Koray"                     "+"   1987  9.59 "OLS"                     "-"              "No"  "No"   28 "0" "0" "1" 1
        "Presbitero"                "+"   2005   .92 "SGMM, DGMM, OLS, LSDV"   "Fixed"          "No"  "Yes" 350 "1" "1" "0" 1
        "Schclarek"                 "+"   2004  .314 "SGMM"                    "-"              "Yes" "Yes" 282 "1" "1" "0" 1
        "Clements et al."           "+"   2003 3.209 "SGMM, LSDV"              "Fixed"          "Yes" "Yes" 272 "1" "1" "1" 1
        "Looney and Frederiksen"    "+"   1986  1.96 "OLS"                     "-"              "No"  "No"   52 "0" "1" "1" 1
        "Mohamed"                   "+"   2005  2.17 ""                        ""               "No"  "No"   24 "0" "1" "1" 1
        "Blavy"                     "+"   2006   1.8 "OLS, IV, DGMM, SGMM"     "Fixed & Random" "Yes" "Yes" 383 "0" "0" "1" 1
        "Maghyereh et al."          "+"   2002  2.05 "OLS"                     "-"              "Yes" "No"   31 "0" "1" "0" 1
        "Weeks"                     "-"   2000 -3.02 "OLS"                     "-"              "No"  "No"   90 "0" "1" "1" 0
        "Bjerg et al."              "-"   2007 -1.96 "GLS"                     "Fixed"          "No"  "No"  157 "0" "1" "0" 0
        "Scott"                     "-"   1995    -2 "GLS"                     "Fixed"          "No"  "No"  232 "0" "1" "1" 0
        "Paudel and Shrestha"       "+"   2006  1.53 ""                        ""               "No"  "No"   34 "0" "1" "1" 1
        "Patillo et al."            "-/+" 2002  2.47 "LSDV, IV, SGMM, OLS"     "Fixed"          "Yes" "Yes" 630 "1" "1" "1" 2
        "Ali-abbas and Christensen" "-/+" 2007 -4.17 "SGMM, OLS "              "Fixed"          "Yes" "Yes" 279 "0" "1" "1" 2
        "Cordella et al."           "+"   2005  1.96 "SGMM, OLS"               "-"              "Yes" "Yes" 703 "1" "1" "1" 1
        end
        Carlo, I have not yet had a look at the -metan- module, but I will look at it now as well and see if it helps.

        If either of you could provide any more help it would be greatly appreciated, and thank you so much for your help thus far.

        John



        • #5
          John:
          taking a look at the first paper you mentioned, I would consider -help meta regress- (the approach that was actually followed in https://mpra.ub.uni-muenchen.de/2147...aper_21474.pdf).
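          A minimal sketch of what that might look like once you have an effect size and standard error per estimate (es and se below are placeholders you would still need to construct; the moderators come from your -dataex- excerpt and, being strings, would need -destring-ing first):

          Code:
          destring HIPIC ExternalDebt Published, replace
          meta set es se, studylabel(AuthorsOfStudy)
          meta regress Year obs i.Published i.HIPIC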
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            I skimmed the first paper you cited. They are indeed using the t-statistic. You should be able to simply extract the beta related to debt (making sure it's measured the same way across all papers, presumably something like debt as a percentage of GDP, measured in percentage points) and divide it by its standard error. Then you don't even need to use the built-in meta-analysis commands, because here we come to a bit of a divergence between how this paper thinks of meta-analysis and how it's done elsewhere.

            In the stock command, we would weight each individual study by the inverse of its variance (that is, the variance of the estimated effect size). Basically, bigger studies will have smaller variances simply by virtue of their larger samples. Say you had one big trial that estimated a 0.1 SD increase in the dependent variable, and two tiny trials that estimated a 1.0 SD increase. Are you really going to take the average of 0.1, 1, and 1, and say that this thing is very effective? Would you not trust the large trial more? Most people would, and the meta-analysis approach I am familiar with does account for that. It's described with equations in the documentation for the meta-analysis commands. To some extent, the method used here does account for each study's variance by using the t-statistic; a small study with a large effect size would probably still have a small t-statistic. That's fine, but I have some problems with it, as detailed below.
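            To make the weighting concrete, here is a toy sketch with made-up numbers echoing the scenario above:

            Code:
            * toy data: one large trial (small SE) and two small trials (large SEs)
            clear
            input str8 study double(es se)
            "big"    0.1 0.05
            "small1" 1.0 0.50
            "small2" 1.0 0.50
            end
            meta set es se, studylabel(study)
            meta summarize, common
            * the common-effect model weights each study by 1/se^2, so the big
            * trial (weight 400) swamps the small ones (weight 4 each), and the
            * pooled estimate lands near 0.12 rather than the naive mean of 0.7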

            As far as I can tell, the first paper is just treating each study (and, within some studies, each of the multiple regression specifications presented, e.g. OLS vs Arellano-Bond vs Blundell-Bond estimators) as an observation in an OLS regression, with the t-statistic and the elasticity as separate dependent variables. Now, a bigger issue arises in how they created their sample, described under section 2:

            The meta-analysis employs data on 17 studies, 12 of which are published in peer-reviewed academic journals. The papers are obtained from searches of JSTOR, EBSCO and ProQuest for key words ‘debt growth’.3 A search was also conducted of working paper series via working paper databases via Google Scholar as well as the references of the papers obtain from the sources listed above. Of the papers, 7 used observations for Highly Indebted Poor Countries (HIPC). Some of the papers reported had different regression specifications and samples, as a result, multiple results are included in the sample for some papers. For example, Pattillo, et al. (2004), Hepp (2005), Patillio, et al. (2002) and Ali-Abbas and Chirstensen (2007) all had multiple model specifications in the study. Therefore, a total of 62 observations were used in the meta-analysis.
            My first thought on this issue is that counting multiple regression specifications is like double-counting results obtained from the same data. That would skew the point estimate of the result towards studies with those multiple specifications. My instinct would be to choose the most credible regression specification or estimator. For example, if they are presenting an OLS versus an instrumental variable estimate, then I'd take the IV estimate - if the IV weren't credible, then most likely I'd have failed the study in quality control. Secondarily, I'd suspect these studies may cover overlapping country-years - there are only so many countries to go around, after all. It seems like this might be a slightly lesser issue as well; it simply does not happen in my field (because patient data from randomized trials is protected and never shared between different researchers), so I lack the experience to say how much of an issue it is.

            Also, in most meta-analyses, I want to know how much some treatment changes the outcome, along with a confidence interval around that estimate. Counting each regression specification in studies that report several makes the regression think it has more information than it actually does. This artificially shrinks the standard error of the estimate. Moreover, say someone asked, "So, based on the meta-analysis, what percentage point change in GDP is associated with a public debt per capita increase of one percentage point?" If you treat the t-statistic as the dependent variable, you fundamentally can't answer this question. You can only answer the binary question: does an increase in debt increase economic growth or not? That is less satisfying.

            More about standard errors and confidence intervals. In my field of health services research, we usually don't assume that all interventions have the exact same effect. You might have a bunch of trials of antidepressants, but with different drugs in the same drug class; even with the exact same drug, dosages or timings of administration may differ slightly; organizational factors will differ between sites; patient characteristics will differ between trials; and so on. We would normally start with a random-effects meta-analysis to decompose the variance between and within studies. In a fixed-effects meta-analysis, you assume that each study has the exact same effect and that any variation in the effect estimate is sampling variation; you might meet this assumption if you had a bunch of studies of very similar drugs at very similar dosages in pretty similar patients.

            In meta-analyses of organizational interventions, you will for sure have more variation, because context matters more, and in fact the exact contents of the intervention could vary between trials. For example, you might have a bunch of studies of the effect of hospital pre-surgery checklists. Even if they are all for, say, hip replacement surgery in academic medical centers (which would be oddly specific), the items in the checklists could differ. The effects of public debt on economic growth do, at first glance, seem to me more like the sort of thing where you would want to start with a random-effects meta-analysis. I'd guess that the method outlined in the paper is closer to a fixed-effects meta-analysis.
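            In Stata terms, the two assumptions are just different options on the same command; a sketch, again assuming declared es and se variables:

            Code:
            meta summarize, random(reml)  // random effects: reports tau2 and I2
            meta summarize, common        // common ("fixed") effect: one true effect assumed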

            The fixed versus random meta analysis subject is a bit epistemologically difficult. The Stata manual has some reading on it, and here's another article that goes into a bit of depth. If fixed effects meta analysis is an accepted practice in this particular field, then I can withdraw that objection. However, it seems like double counting regression specifications is a big issue. I could be wrong and I've certainly been wrong in the past, but if I were reviewing a journal article like that I would object vociferously.



            • #7
              Hi Weiwen,

              I can't thank you enough for taking the time to help me out and writing such an in-depth reply, it really is greatly appreciated.

              I have been looking through the papers used in the previous meta-analysis I cited and have tried to reverse engineer the t-statistic values they used by extracting the beta related to debt and dividing by the standard error. However, I am struggling to recreate their t-statistic values. For example, looking at this paper from their analysis, for which the meta-analysis reports a t-statistic of 0.314, I cannot figure out which beta and standard error they used to get 0.314. The paper includes 18 different regression tables covering many different variables, so I am not sure where to find the exact beta and standard error that they used. I know you mentioned that it should be the beta related to debt (measured the same way across all of the papers), but I have yet to figure out which beta this is. This brings me to the point you make about the paper double-counting results from the same paper and skewing the results in favour of the papers that included multiple regression specifications. I think this is very interesting and something I had not considered. It is certainly a discussion point I will address as I continue my work.

              Your point that the t-statistic can only address whether an increase in debt increases growth, rather than by how much growth increases after a one-percentage-point increase in debt, is also well taken. I will spend some time thinking about whether there is something else I can use as a dependent variable; I know that the previous paper also used the elasticity of growth with respect to debt as a second dependent variable. When looking through the Stata meta-analysis manuals on what type of model I should use (random vs fixed vs common), I had come to the conclusion that a random-effects model would be my best option, and it is nice to hear that you think the same.

              Thank you so much again for all of your help, it made my day reading your reply and it has given me a few extra thoughts about where I can take my analysis. If you could help me out at all with figuring out exactly how they calculated their t-statistics it would be immeasurably helpful.

              John.


              Carlo - I will also have a look at help meta regress and see whether that helps, thank you very much for your help as well.



              • #8
                Hmm. The Schclarek and Hepp papers do bring up a separate issue of subgroup analysis. In my field, you'd normally report results from the whole RCT. Separately, though, you might have said a priori (and publicly filed in your analysis plan) that you were going to do a separate subgroup analysis on, for example, women - here, you might have re-run the RCT regression and added a beta (and interaction terms) for women, or maybe you subset the population to women. Or you might have decided to do this post hoc. A systematic reviewer would probably be looking for at least some subgroup studies. You would have one section for the main analyses, and then separate analyses where you might be able to pool some studies addressing a subgroup.

                Schclarek appears to be doing one set of analyses for developing countries and another for industrial countries. In principle, I don't think I'd object if someone did a meta-analysis with one observation from developing and one from industrial: this author reports the results completely separately from the beginning, and each set has its own estimate. Then, however, he also splits his analysis by total external debt to GDP, total external debt to exports, public debt to GDP and to exports, private debt split the same way, and a few others. My inclination is that you definitely can't count the total debt set as one study and then count the public and private debt sets as additional ones; you would be double counting (I assume that public and private debt are subsets of total debt). I think it would be most sound to just count the total debt regression (assuming you do the same in the other studies). You could later do subgroup analyses for public and private debt. You definitely can't count each of the 5 regression specifications where he adds control variables - I'd choose the single most credible set. If you enter the results from developing and industrial countries separately, I am leaning towards that being OK (and then, naturally, an indicator for the type of country should probably be in your meta-regression), as in the sketch below.
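                In -meta- terms, the subgroup pooling could look something like this sketch (es, se, and sample are placeholder variables, with sample marking whether an estimate comes from developing or industrial countries):

                Code:
                * one observation per country group per study, pooled within subgroups
                meta set es se, studylabel(AuthorsOfStudy)
                meta summarize, subgroup(sample)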

                Hepp seems to present analyses for heavily-indebted poor countries, non-HIPC developing countries, and all developing countries. Again, all developing + HIPC + non-HIPC is definitely not OK. HIPC + non-HIPC may be OK. Counting each of 5 regression specifications is not OK.

                Anyway, even if you can't tell how to replicate the author's exact results, you might just want to do things from scratch to see where you get. There's an art as well as a science to this. Sticking purely to the mechanics of how the computer is doing the meta-analysis or meta-regression, that bit is easy. There are accepted scientific principles for how to thoroughly search and assess study quality in systematic reviews, but these are fuzzy compared to the math, and there's more room for error here (and I don't want to be mean, but you're skipping this part in this analysis; that can be fine but do be aware of it). Then there's the more purely art side of how to define subgroups in a principled fashion, what question to ask, stuff like that.

