Comparison between two Weighted Means

Mich Prov

Join Date: Jul 2016

Posts: 49
#1


05 Mar 2018, 12:04

Only a simple question:

I have 17 studies (code==1 is treatment A and code==2 is treatment B). For each study I have the number of patients in treatment A (n_a) and in treatment B (n_b), and the mean age for treatment A (age_a) and B (age_b). How can I compare the 17 mean ages while weighting by the number of patients in each study?

I suppose that I can compute "sum age_a [iweight=n_a]" and "sum age_b [iweight=n_b]" to get the N-weighted mean across all 17 studies... then can I use a simple t-test?
Mich Prov

#2

05 Mar 2018, 12:07

To clarify: I have to compare the weighted mean of age between the two treatment groups, A and B!
Clyde Schechter

Join Date: Apr 2014

Posts: 30068
#3

05 Mar 2018, 12:11

From what you describe, you have aweights, not iweights. Check the Stata PDF documentation that comes with your installation for the definitions of the different types of weights, and then think about your data to see which category fits best. From what you have said, it really sounds like aweights.

Anyway, assuming it is aweights, you can do this:

Code:

mean age [aweight = npatients], over(code)
test A = B

where npatients is the name of the variable containing the number of patients in each study, and A and B are the value labels attached to your variable code.
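The command above assumes one row per study arm, with the variables age, npatients, and code. If the data are instead stored as described in #1 (one row per study, with n_a, n_b, age_a, and age_b), a reshape along the following lines might get them into that layout. This is only a sketch: the study identifier study and the new variable names arm and trt are assumptions for illustration, not names from the actual dataset.

```stata
* Assumption: one row per study, uniquely identified by a variable called study,
* holding n_a, n_b, age_a, age_b as described in #1.
reshape long n_ age_, i(study) j(arm) string   // arm takes the values "a" and "b"
rename (n_ age_) (npatients age)
encode arm, generate(trt)                      // hypothetical numeric arm variable, labelled a and b

* Weighted mean age in each arm, each study weighted by its number of patients
mean age [aweight = npatients], over(trt)
```

After the reshape there are two rows per study, one for each arm, which is the layout that over() expects.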

In the future, when asking for help with code, include example data in your post. That way we don't have to write sentences about what the variable names might be. We also don't have to make assumptions about data storage types or other details that may be crucial to getting the code to work correctly if you actually show the data instead of leaving it to the reader's imagination.

There is a great tool for showing example data on this forum: -dataex-. (I'm surprised after 25 posts that nobody has yet called your attention to it.) If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code.

Always show example data when asking for help with code. Always use -dataex- when showing example data.
Mich Prov

#4

05 Mar 2018, 12:22

Oh thanks... I will follow your instructions in future posts!!!
Anyway, I ran this command and it seems to work, but when I run test A=B this appears: Constraint 1 dropped... I'm trying to solve this... thank you professor!
Clyde Schechter

#5

05 Mar 2018, 12:29

Well, this would definitely be easier to resolve had you shown the actual commands and output. It may have to do with the way Stata names what it posts in e() and _b[], which will depend on your variable names and their labels, and sometimes depends on the version of Stata you are running. If you do this:

Code:

mean age [aweight = npatients], over(code) coeflegend

you will get the output again, this time with the appropriate references to _b[] shown in the right side of the output table. Then you can run -test _b[whatever] = _b[whatever_else]-, using the _b[] notations shown in the output, and you shouldn't have a problem.
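As a rough illustration of that two-step pattern: the names inside _b[] below are an assumption about what the legend may show in recent Stata releases (16 or newer), and older versions print different names, so copy whatever appears in your own output rather than these.

```stata
* Step 1: redisplay the estimates with the coefficient legend
mean age [aweight = npatients], over(code) coeflegend

* Step 2: test equality of the two weighted means, using the _b[] names
* exactly as printed in the Legend column.
* The names below are illustrative only and depend on the Stata version.
test _b[c.age@1.code] = _b[c.age@2.code]
```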

The less information you give, the harder it is for people to help you. The more you show, the easier it is. If code isn't doing what you expect it to do, showing the code and the output you're getting is helpful. The more you leave to people's imagination, the less likely they are to respond, and if they do respond, the less likely it is that their suggestions will actually work.
Mich Prov

#6

05 Mar 2018, 12:55

Great... it works very well!
Is it possible to obtain a similar comparison for percentages (e.g. gender (0/1) rather than age)? Or a comparison of medians (e.g. proteinuria rather than age), say with a test suited to non-symmetric distributions?
Mich Prov

#7

05 Mar 2018, 12:55

I'm looking into how to share the data! Thanks
Clyde Schechter

#8

05 Mar 2018, 14:23

There is a -proportion- command that works very much like -mean-. It does not support -aweights-; but here it would be appropriate to use -iweights- because of the way it arrives at standard errors.

To contrast medians, I would probably go with the -qreg- command, again using iweights for this one.

See the help files for both of these commands.
Mich Prov

#9

05 Mar 2018, 14:49

Amazing!

Mich Prov


#10

11 Mar 2018, 09:14

Dear Prof Schechter,
following up on this earlier topic, I'm trying to compare the percentage of males between code==1 and code==2 (treatment A and B).
So I launched the command:

"qreg code male[iweight = n_pat]" but the output is

male is the percentage of males (e.g. 60%, 35%, 28%) in each study, and n_pat is the number of patients in that study... I would like to compare these percentages with a test for non-normal variables, because the male percentages are not normally distributed in code==1 and code==2.

How can I run this?


Mich Prov

#11

11 Mar 2018, 09:14

I tried to attach the output as you suggested, Professor!
Kind regards
Clyde Schechter

#12

11 Mar 2018, 10:45

It does not make sense to use -qreg- to find the "median" of a variable like code which is a dichotomy. You originally mentioned medians in the context of a variable proteinuria, which I took to mean you had a quantitative measure of urine protein (not just a yes/no for presence of protein in the urine.) So the code would look like:

Code:

qreg urine_protein i.code

In the future, when asking for help with code, show data examples. And when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

When asking for help with code, always show example data. When showing example data, always use -dataex-.
Mich Prov

#13

11 Mar 2018, 10:56

Nice, I have to change the order... it's correct!! But how can we add [iweight=n_patients] to weight the median of urine_protein by the number of patients in each study?

We did this for comparing means of age over code :
"mean age [aweight = npatients], over(code) coeflegend"
Clyde Schechter

#14

11 Mar 2018, 11:16

Code:

qreg urine_protein i.code [iweight = weight_variable]

Mich: All Stata estimation commands have the same syntax for incorporating weights. The only way in which they differ is which kinds of weights they allow. So you don't need to ask this question for every command. The syntax is always the same as shown here. To learn which kinds of weights are allowed by the particular command you can check the help file for that command.
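To make the shared pattern concrete, here is a small sketch using variable names that appear earlier in this thread (age, urine_protein, code, npatients). Treat those names as placeholders for whatever the dataset actually contains.

```stata
* The weight specification always goes in square brackets after the variable
* list and before the comma that introduces the options.
mean age [aweight = npatients], over(code)         // weighted means by treatment
qreg urine_protein i.code [iweight = npatients]    // weighted median regression

* In the qreg output, the coefficient on 2.code estimates the difference in
* weighted medians between code==2 and code==1.
* -help mean- and -help qreg- list the weight types each command accepts.
```

Only the weight type changes from one command to the next: mean accepts aweights, whereas qreg does not, which is why iweights are used there.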
Mich Prov

#15

12 Mar 2018, 04:52

Great!!! It seems to work perfectly!!
Regarding the names of these tests that I could mention in the methods (statistical analysis) section... Is it correct to say: "differences were tested using weighted quantile regression and weighted Fisher test, for non-normally and normally distributed continuous variables respectively"?

Kind regards!
