Calculate country-specific thresholds using continuous health index and country health distribution

Tho Dang

Join Date: Jul 2018

Posts: 3
#1

Calculate country-specific thresholds using continuous health index and country health distribution

28 Aug 2018, 23:51

Dear Statalisters,

I am using Stata 13 to analyse waves 1, 2, 4, 5 and 6 of the Survey of Health, Ageing and Retirement in Europe (SHARE), investigating the effect of health on the labour force participation of older workers. The main explanatory variable, sph, is ordinal, coded 1 = Excellent, 2 = Very good, 3 = Good, 4 = Fair and 5 = Poor. To address the "state-dependent reporting bias" in self-reported health, I computed a health index by first running a generalised ordered probit regression (goprobit) of self-reported health on a set of quasi-objective health indicators, i.e. self-reports of chronic conditions. From the goprobit results I calculate a disability weight for each condition, then subtract the total of the disability weights from 1 to obtain the health index. After normalisation, the health index, z_index, is a continuous variable ranging from 0 to 1.

Given the health index, I want to calculate the country-specific thresholds [as the exact quantiles of the country-specific health index distribution that correspond to the proportion of respondents that report up to a specific health level] (Jurges 2004). As I understand it, I need to tabulate the original self-reported health variable (sph) by country and wave to obtain the cumulative percentages of each country's respondents reporting each health category, and then run _pctile on z_index using these cumulative percentages. However, the stored results after tabulate contain only two scalars, r(N) for the number of observations and r(r) for the number of categories of the tabulated variable; the cumulative percentages are not stored.

As there are between 12 and 28 countries in each wave, running tab and _pctile by hand would take a long time and involve typing a lot of numbers. I therefore think it would be quicker to write a program and loop it over each country and wave.

My trial code is as follows:

Code:

capture program drop pcal
program define pcal

// tabulate sph to get frequencies of each sph category & save the frequency matrix
tab sph if `1'==1 & `2'==1, matcell(A)

// extract frequency of each category and put in scalars
scalar n=r(N)
forval i=1/5 {
scalar r`i'=A[`i',1]
}

// generate scalars as cumulative percentages
scalar c1=r1/n
forval i=2/5 {
scalar c`i'=c`i-1'+r`i'/n
}

// store scalar values
forval i=1/5 {
scalar ce`i'=e(c`i')
}

// calculate percentiles of z_index based on determined cumulative percentages of sph
_pctile z_index, p(`ce1', `ce2', `ce3', `ce4', `ce5')
return list

// drop all generated scalars and matrices
scalar drop _all
matrix drop _all
end

// Trial execution of the program pcal
pcal austria 1

Self-percei |
 ved health |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      1,304        7.96        7.96
          2 |      3,863       23.58       31.54
          3 |      5,844       35.67       67.21
          4 |      4,115       25.12       92.33
          5 |      1,257        7.67      100.00
------------+-----------------------------------
      Total |     16,383      100.00
option p() incorrectly specified
r(198);

When I ran the program for the first country (austria is a dummy variable coded 1 for Austria and 0 otherwise) and the first wave (wave==1), it returned error r(198): option p() incorrectly specified.

I know my program may look very basic to many of you, but it is the best I can do as a beginner Stata user. Could you please suggest how I should fix the program, and whether there are better ways to approach the problem?

Your comments are very welcome.

Thanks,
Tho
Tags: categorical, goprobit, health index, Stata program, _pctile
Clyde Schechter

Join Date: Apr 2014

Posts: 30114
#2

29 Aug 2018, 00:01

Well, -pcal- is not part of official Stata, nor does -search pcal- turn up any information about it. (I also could find nothing about it with a Google search.) But -pcal- is the program that is throwing the error message. Since you apparently found this program somewhere and have it installed, I suggest you read its help file (-help pcal-) to see what option p() is supposed to be, and then add it, properly specified in accordance with whatever the help file says, to your -pcal- command.

If that doesn't solve your problem you have two choices. You can wait for a day or two to see if some other Forum member who is familiar with -pcal- chooses to respond to your questions. Or, you can contact the author of -pcal- directly for advice. (Most authors of community-contributed Stata programs put their names and contact information in the help file.)
Tho Dang

Join Date: Jul 2018

Posts: 3
#3

29 Aug 2018, 06:41

Dear Clyde,

Many thanks for your quick response. Actually, -pcal- is a program that I wrote myself to solve the problem described above, and I am asking for comments on it. I know that -pcal- fails because something is wrong with the p() option. I suspect that locals such as `ce1' and `ce2' inside the parentheses are not accepted as a numlist for p(), but I do not know how to fix this.

Any further advice would be appreciated.

Kind regards,
Tho
William Lisowski

Join Date: Dec 2014

Posts: 10150
#4

29 Aug 2018, 07:10

The only p() option I see is the call to _pctile. Since in

Code:

_pctile z_index, p(`ce1', `ce2', `ce3', `ce4', `ce5')

your arguments were created as scalars, not locals, you should instead code

Code:

_pctile z_index, p(ce1, ce2, ce3, ce4, ce5)

See help scalar and the associated PDF documentation for more details. For example,

Code:

. scalar a = 4

. display sqrt(a)
2
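
To illustrate the distinction, here is a minimal sketch meant to be run from a do-file (the names a and b are only illustrative). A scalar is referenced by its bare name, whereas `name' asks Stata to expand a local macro. A local that was never defined expands to nothing, which would explain why the original p(`ce1', `ce2', ...) reached _pctile as an empty list of commas.

```stata
* Scalars are referenced by name; local macros are referenced in `quotes'.
scalar a = 4
local  b = 9

display sqrt(a)       // the scalar a is used by name
display sqrt(`b')     // `b' is replaced by 9 before the command runs

* No local called ce1 has been defined here, so `ce1' expands to nothing
* and the command Stata actually executes is: display "[]"
display "[`ce1']"
```

The scalars ce1 to ce5 created in the program exist, but there are no local macros with those names, so the macro references contribute no numbers at all.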

Tho Dang

Join Date: Jul 2018

Posts: 3
#5

29 Aug 2018, 07:43

Dear William,

Thank you. I revised the program as you suggested, but Stata still returns the same error.

Code:

option p() incorrectly specified
r(198);

The revised program is as follows; I rewrote the second step (// generate scalars as cumulative percentages) slightly to make it clearer.

Code:

capture program drop pcal
program define pcal

// tabulate sph to get frequencies of each sph category & save the frequency matrix
tab sph if `1'==1 & `2'==1, matcell(A)

// extract frequency of each category and put in scalars
scalar n=r(N)
forval i=1/5 {
scalar r`i'=A[`i',1]
}

// generate scalars as cumulative percentages 
scalar c1=r1/n
scalar c2=c1+r2/n
scalar c3=c2+r3/n
scalar c4=c3+r4/n
scalar c5=c4+r5/n

// store scalar values
forval i=1/5 {
scalar ce`i'=e(c`i')
}

// calculate percentiles of z_index based on determined cumulative percentages of sph
_pctile z_index, p(ce1, ce2, ce3, ce4, ce5)

// drop all generated scalars and matrices
scalar drop _all
matrix drop _all
end

// Trial execution of the program pcal
pcal austria 1
return list

Self-percei |
 ved health |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      1,304        7.96        7.96
          2 |      3,863       23.58       31.54
          3 |      5,844       35.67       67.21
          4 |      4,115       25.12       92.33
          5 |      1,257        7.67      100.00
------------+-----------------------------------
      Total |     16,383      100.00
option p() incorrectly specified
r(198);

I wonder whether one of the commands before _pctile is wrong?


William Lisowski

Join Date: Dec 2014

Posts: 10150
#6

29 Aug 2018, 08:50

Let us look at this command, from the middle of a loop with i running from 1 to 5.

Code:

scalar ce`i'=e(c`i')

Now suppose we're on the third pass through the loop, so i is 3 and this command becomes

Code:

scalar ce3=e(c3)

where c3 will be a cumulative percentage, per the comment in the program, so it will be a number between 0 and 1. Suppose it is 0.42; the command is then equivalent to

Code:

scalar ce3=e(0.42)

What does that mean? What is it supposed to accomplish?
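
A small sketch of what that line does in practice, assuming no estimation command has stored a result called c3. As far as I can tell, Stata does not substitute the scalar's value inside e(); it looks for a stored estimation result named c3, finds none, and returns missing. The value 0.42 is the illustrative figure used above.

```stata
* Assumption: no estimation results named c3 are in memory.
scalar c3  = 0.42
scalar ce3 = e(c3)      // e(c3) is looked up by name among estimation results

display c3              // the cumulative proportion itself
display ce3             // missing (.) when e(c3) does not exist
display missing(ce3)    // 1 if ce3 is missing
```

Under that assumption every ce scalar ends up missing, so _pctile cannot use them whatever syntax is chosen for p().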

My thought is that the loop creating the ce scalars is not necessary, and your _pctile command should be

Code:

_pctile z_index, p(c1, c2, c3, c4, c5)
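
For completeness, here is a hedged sketch of how the whole calculation might be written. It is one possible approach under stated assumptions, not a tested solution for the SHARE data. The assumptions are:

- As far as I know, percentiles() takes a numlist of literal numbers rather than scalar names, so the sketch builds the list in a local macro.
- The option works in percent, while the cumulative shares above are proportions, so the sketch multiplies by 100.
- The last cumulative share is always 100%, so only the cutpoints below it are requested.
- _pctile needs the same if condition as tabulate; otherwise it uses all observations.
- In the original program, `2'==1 expands to 1==1 when the second argument is 1, so the wave was never actually filtered.

The program name pcal2, the returned names t1 to t4 and the toy data are hypothetical.

```stata
* Toy data standing in for SHARE (hypothetical values; names follow the thread).
clear
set seed 12345
set obs 2000
generate byte   austria = 1
generate byte   wave    = 1 + mod(_n, 2)                   // waves 1 and 2 only
generate double z_index = runiform()
generate byte   sph     = 6 - max(1, ceil(5 * z_index))    // 1 = Excellent ... 5 = Poor

capture program drop pcal2
program define pcal2, rclass
    args country w                              // e.g. pcal2 austria 1
    tempname A
    quietly tabulate sph if `country' == 1 & wave == `w', matcell(`A')
    local n = r(N)
    local k = r(r)                              // categories actually observed
    if `k' < 2 {
        display as error "fewer than two sph categories observed"
        exit 198
    }
    * Cumulative percentages for all but the last category, as literal numbers.
    local km1 = `k' - 1
    local cum = 0
    local plist
    forvalues i = 1/`km1' {
        local cum = `cum' + `A'[`i', 1]
        local pct = 100 * `cum' / `n'
        local plist `plist' `pct'
    }
    * Same sample restriction as the tabulation above.
    _pctile z_index if `country' == 1 & wave == `w', percentiles(`plist')
    forvalues i = 1/`km1' {
        return scalar t`i' = r(r`i')
    }
    return local percentages "`plist'"
end

* Loop over the waves present in the toy data and show the four thresholds.
foreach w of numlist 1 2 {
    pcal2 austria `w'
    display "wave `w': " r(t1) " " r(t2) " " r(t3) " " r(t4)
}
```

With the real data, the outer loop could run over the country dummies and over the waves with something like foreach w of numlist 1 2 4/6. One caveat: sph is coded 1 = Excellent while z_index presumably rises with health, so the shares may need to be accumulated from category 5 downward to line up with the index. The sketch keeps the ordering used in the thread.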