  • New variable: Proportion of interviews conducted in Spanish

    Hi all, I am trying to construct a variable that captures the proportion of interviews conducted in English. To provide context, I have three language-of-interview variables from NHANES (languagesp, languagemec, languagefam). Each is coded dichotomously as 0 = Spanish / 1 = English. For example, if only one of the three interviews was conducted in English, I would want the proportion of interviews conducted in English to be (0+0+1)/3 = .33.

    I am unsure how to generate a variable that accomplishes this. Any suggestions would be appreciated.

  • #2
    Unless I'm missing something, it seems like you're looking for this:

    Code:
    gen proportion_english = (languagesp + languagemec + languagefam) / 3
    Or, if there are missing values and you want to treat them as 0, you can do it like this:

    Code:
    egen proportion_english = rowtotal(languagesp languagemec languagefam)
    replace proportion_english = proportion_english / 3
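
To make the difference between the two snippets above concrete, here is a minimal sketch on a toy dataset. The four rows are made-up values, not NHANES data, and prop_plain and prop_total are hypothetical names chosen only for this illustration.

```stata
* Toy data (hypothetical values): 0 = Spanish, 1 = English, . = missing
clear
input languagesp languagemec languagefam
0 0 1
1 1 1
0 . 1
. . .
end

* Plain arithmetic: the result is missing whenever any input is missing
gen prop_plain = (languagesp + languagemec + languagefam) / 3

* rowtotal() counts missing values as 0, so the denominator stays at 3
egen prop_total = rowtotal(languagesp languagemec languagefam)
replace prop_total = prop_total / 3

list, clean
```

In the third row the two versions disagree: prop_plain is missing while prop_total is .33. In the fourth row, where nothing was recorded, prop_total is 0, which may or may not be what you want.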


    • #3
      I'm not sure the advice in #2 is right in the situation where there can be missing values for the language variables. It really depends on what the meaning of a missing value is.

      If the value of one of these variables is missing, it might be because that particular interview was never done. In that case if the values are, say 0, ., and 1, then the correct proportion is actually 1/2, not 1/3. If this is why we have missing values, I would do it as:
      Code:
      egen proportion_english = rowmean(languagesp languagemec languagefam)
      Another possibility is that the missing value represents "we don't know what language was used." This might happen if, for example, the data collection system failed to record the information even though the interview took place, or perhaps the interview was conducted partially in English and partially in Spanish and the person recording the information skipped the question, not knowing what to respond. In that situation, we would have to acknowledge that we actually don't know what the proportion of English interviews is, so the result should be left missing unless all three variables are observed:
      Code:
      egen proportion_english = rowtotal(languagesp languagemec languagefam)
      replace proportion_english = proportion_english/3 if !missing(languagesp, languagemec, languagefam)
      replace proportion_english = . if missing(languagesp, languagemec, languagefam)
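
As a rough side-by-side of the two readings of a missing value, the sketch below counts how many interviews were actually recorded and then builds both versions. The toy rows are invented for illustration, and n_done, prop_done, and prop_known are hypothetical names, not variables from NHANES.

```stata
* Toy data (hypothetical values): 0 = Spanish, 1 = English, . = missing
clear
input languagesp languagemec languagefam
0 0 1
0 . 1
. . .
end

* Number of interviews with a recorded language (the denominator rowmean() uses)
egen n_done = rownonmiss(languagesp languagemec languagefam)

* Reading 1: missing means the interview was never done,
* so average over the interviews that were recorded
egen prop_done = rowmean(languagesp languagemec languagefam)

* Reading 2: missing means the language is unknown,
* so keep the proportion only when all three interviews are observed
gen prop_known = prop_done if n_done == 3

list, clean
```

For the row with values 0, ., 1, prop_done is .5 and prop_known is missing. When all three variables are missing, both results are missing.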
