statistical significance ratio difference

Nicolas Rodriguez

Join Date: Jul 2016

Posts: 63
#1

statistical significance ratio difference

29 Oct 2018, 13:55

Hello everybody, I have a question about testing the statistical significance of the difference between two ratios.
My data look like this:

PHP Code:

ano   sexo    edad  expr  varstrat  varunit   activ
2013  mujer   36    32    11121     11121100  ocupados
2013  mujer   30    32    11121     11121100  ocupados
2013  hombre  41    32    11121     11121100  ocupados
2013  mujer   41    32    11121     11121100  ocupados
2013  mujer   10    32    11121     11121100
2013  hombre  9     32    11121     11121100
2013  hombre  73    32    11121     11121100  inactivo
2013  mujer   73    32    11121     11121100  inactivo
2013  mujer   60    32    11121     11121100  ocupados
2013  mujer   49    32    11121     11121100  ocupados
2013  mujer   21    32    11121     11121100  ocupados
2013  mujer   17    32    11121     11121100  inactivo
2013  hombre  1     32    11121     11121100
2013  hombre  37    32    11121     11121100  ocupados
2013  mujer   33    32    11121     11121100  ocupados
2013  hombre  3     32    11121     11121100
. . .

In summary, I have over 400,000 observations from survey data for 2013 and 2017. I need to test the statistical significance of the difference between two ratios: the percentage of employed people in 2013 vs. 2017. To do that I use the svy and ratio commands, and to test the difference I use the lincom command. I got the following:

PHP Code:

Survey: Ratio estimation

Number of strata =     607        Number of obs   =    434,930
Number of PSUs   =   3,464        Population size = 35,080,531
                                  Subpop. no. obs =    106,224
                                  Subpop. size    =  8,671,466
                                  Design df       =      2,857

     _ratio_1: tredad_1/ocupados

         2013: ano = 2013
         2017: ano = 2017

--------------------------------------------------------------
             |             Linearized
        Over |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
_ratio_1     |
        2013 |   .2332271   .0033352      .2266874    .2397668
        2017 |   .2243416   .0038795      .2167346    .2319486
--------------------------------------------------------------

. lincom [_ratio_1]2013 - [_ratio_1]2017

 ( 1)  [_ratio_1]2013 - [_ratio_1]2017 = 0

------------------------------------------------------------------------------
       Ratio |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   .0088855   .0051192     1.74   0.083    -.0011521    .0189231
------------------------------------------------------------------------------

Therefore, my question is whether it is correct to use the lincom command to test the statistical significance of the difference between these two ratios.
Thank you very much for all your comments.
Kind Regards.
Nicolas Rodriguez

Join Date: Jul 2016

Posts: 63
#2

30 Oct 2018, 06:00

If it helps, the code I use to get the above results is as follows:

PHP Code:

use Base_1, clear

gen tedad=.
replace tedad=1 if edad>=15 & edad<=29
replace tedad=2 if edad>=30 & edad<=44
replace tedad=3 if edad>=45 & edad<=59
replace tedad=4 if edad>=60
tab tedad, g(tredad_)

#delimit ;
label define tedad 1 "15-29 años" 2 "30-44 años" 3 "45-59 años" 4 "60 años o más";
#delimit cr
label values tedad tedad
label variable tedad "T Edad"

gen ocupados= (activ==1)

svyset varunit [w=expr], strata(varstrat) vce(linearized) singleunit(certainty)
svy, subpop(if activ == 1 & sexo==1): ratio tredad_1/ocupados, over(ano)
lincom [_ratio_1]2013 - [_ratio_1]2017
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#3

30 Oct 2018, 08:25

Yes it's perfectly all right to use lincom in this way. However, basing your conclusion on "statistical significance" alone might be wrong. If there had been a census (complete enumeration) in both years, would the employment rates be identical? Of course not. So before doing the calculation, you know that the null hypothesis is false. The real question is "how different" were the rates and that question is answered by the confidence interval for the difference.

It might also be informative to report the ratio of the rates or the percent change in rates:

Code:

nlcom 100*([_ratio_1]2017-[_ratio_1]2013)/ [_ratio_1]2013

Last edited by Steve Samuels; 30 Oct 2018, 08:46.

Steve Samuels
Statistical Consulting
[email protected]

Stata 14.2
Nicolas Rodriguez

Join Date: Jul 2016

Posts: 63
#4

30 Oct 2018, 08:54

Thank you very much for your comments, Steve. I perfectly understand your point. So, in this case, how can I interpret the confidence interval to answer the real question of "how different" the rates are? Or do I have to do something else (another command) to get the confidence interval for the difference?

Also, with the command that you gave me I got the following:

PHP Code:

nlcom 100*([_ratio_1]2017-[_ratio_1]2013)/ [_ratio_1]2013

       _nl_1:  100*([_ratio_1]2017-[_ratio_1]2013)/ [_ratio_1]2013

------------------------------------------------------------------------------
       Ratio |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _nl_1 |  -3.809808   2.159771    -1.76   0.078    -8.042881    .4232647
------------------------------------------------------------------------------

How can I interpret that?

Last edited by Nicolas Rodriguez; 30 Oct 2018, 08:59.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#5

30 Oct 2018, 09:43

1. To interpret the original CI of the difference, I would 1) put the rates in terms of the units in which they are usually reported (per ten thousand?) and 2) express the difference as a change from 2013:

Code:

lincom 10000*[_ratio_1]2017 - 10000*[_ratio_1]2013

Then the estimated change per 10,000 would be -88.9, with a 95% confidence interval of -189.2 to +11.5.
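
As a quick arithmetic check, the per-10,000 figures follow from the lincom output in #1: reverse the sign of the difference (so that it reads as 2017 minus 2013), swap the confidence limits, and multiply by 10,000. A minimal sketch using display, with the numbers copied by hand from that output rather than taken from stored results:

```stata
* Numbers copied by hand from the lincom output in #1 (2013 minus 2017).
* Reversing the sign gives the change from 2013 to 2017; the limits swap.
* Estimated change per 10,000
display %8.1f 10000 * (-.0088855)
* Lower 95% confidence limit per 10,000
display %8.1f 10000 * (-.0189231)
* Upper 95% confidence limit per 10,000
display %8.1f 10000 * (.0011521)
```

Rescaling by a constant changes only the units; the t statistic and p-value are the same as in the original lincom output.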

2. On looking at the relative decrease in rates or the ratio of rates, I don't find either one easy to understand. The ratio is only a little better:

The employment rate in 2017 was 96.2% of the rate in 2013 (95% confidence interval: 92.0% to 100.4%).
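
The ratio of rates can be estimated in the same way as the percent change in #3. This is a sketch that assumes the svy: ratio results from #2 are still the active estimates and that the coefficients are named as in the lincom call there:

```stata
* Rate in 2017 expressed as a percentage of the rate in 2013
nlcom 100*[_ratio_1]2017/[_ratio_1]2013
```

Because this expression equals 100 plus the percent change, its estimate and confidence limits should be those of the nlcom output in #4 shifted up by 100, with the same standard error.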

My advice is to stick to the change.

Steve Samuels
Statistical Consulting
[email protected]

Stata 14.2