Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculate a variable containing a percentage

    Dear All,

    I have the following dataset:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str22 country str3 reg_nuts1 byte a042
    "Germany"       "DE1"  0
    "Germany"       "DE1"  0
    "Germany"       "DE1"  0
    "Germany"       "DE1"  0
    "Germany"       "DE2"  0
    "Germany"       "DE2"  0
    "Germany"       "DE2"  0
    "Germany"       "DE2"  0
    "Germany"       "DE3"  0
    "Germany"       "DE3"  0
    "Germany"       "DE3"  0
    "Germany"       "DE3"  0
    "Germany"       "DE4"  0
    "Germany"       "DE4"  0
    "Germany"       "DE4"  0
    "Germany"       "DE4"  0
    "Germany"       "DE5"  0
    "Germany"       "DE5"  0
    "Germany"       "DE5"  0
    "Germany"       "DE5"  0
    "Germany"       "DE6"  0
    "Germany"       "DE6"  0
    "Germany"       "DE6"  0
    "Germany"       "DE6"  0
    "Germany"       "DE7"  0
    "Germany"       "DE7"  0
    "Germany"       "DE7"  0
    "Germany"       "DE7"  0
    "Germany"       "DE8"  0
    "Germany"       "DE8"  0
    "Germany"       "DE8"  0
    "Germany"       "DE8"  0
    "Germany"       "DE9"  0
    "Germany"       "DE9"  0
    "Germany"       "DE9"  0
    "Germany"       "DE9"  0
    "Germany"       "DEA"  0
    "Germany"       "DEA"  0
    "Germany"       "DEA"  0
    "Germany"       "DEA"  0
    "Germany"       "DEB"  0
    "Germany"       "DEB"  0
    "Germany"       "DEB"  0
    "Germany"       "DEB"  0
    "Germany"       "DEC"  0
    "Germany"       "DEC"  0
    "Germany"       "DEC"  0
    "Germany"       "DEC"  0
    "Germany"       "DED"  0
    "Germany"       "DED"  0
    "Germany"       "DED"  0
    "Germany"       "DED"  0
    "Germany"       "DEE"  0
    "Germany"       "DEE"  1
    "Germany"       "DEE"  0
    "Germany"       "DEE"  0
    "Germany"       "DEF"  0
    "Germany"       "DEF"  0
    "Germany"       "DEF"  0
    "Germany"       "DEF"  0
    "Germany"       "DEG"  0
    "Germany"       "DEG"  1
    "Germany"       "DEG"  0
    "Germany"       "DEG"  0
    "Spain"         "ES1"  0
    "Spain"         "ES1"  1
    "Spain"         "ES1"  0
    "Spain"         "ES1"  1
    "Spain"         "ES2"  0
    "Spain"         "ES2"  0
    "Spain"         "ES2"  0
    "Spain"         "ES2"  1
    "Spain"         "ES3"  1
    "Spain"         "ES3"  0
    "Spain"         "ES3"  0
    "Spain"         "ES3"  1
    "Spain"         "ES4"  1
    "Spain"         "ES4"  0
    "Spain"         "ES4"  0
    "Spain"         "ES4"  0
    "Spain"         "ES5"  0
    "Spain"         "ES5"  0
    "Spain"         "ES5"  0
    "Spain"         "ES5"  1
    "Spain"         "ES6"  0
    "Spain"         "ES6"  1
    "Spain"         "ES6"  0
    "Spain"         "ES6"  1
    "Spain"         "ES7"  0
    "Spain"         "ES7"  1
    "Spain"         "ES7" -1
    "Spain"         "ES7"  0
    "France"        "FR1"  0
    "France"        "FR1"  0
    "France"        "FR1"  0
    "France"        "FR1"  0
    "France"        "FRB"  1
    "France"        "FRB"  0
    "France"        "FRB"  0
    "France"        "FRB"  1
    "France"        "FRC"  0
    "France"        "FRC"  1
    "France"        "FRC"  0
    "France"        "FRC"  0
    "France"        "FRD"  0
    "France"        "FRD"  0
    "France"        "FRD"  0
    "France"        "FRD"  0
    "France"        "FRE"  0
    "France"        "FRE"  1
    "France"        "FRE"  0
    "France"        "FRE"  1
    "France"        "FRF"  0
    "France"        "FRF"  0
    "France"        "FRF"  0
    "France"        "FRF"  1
    "France"        "FRG"  0
    "France"        "FRG"  1
    "France"        "FRG" -1
    "France"        "FRG"  0
    "France"        "FRH"  0
    "France"        "FRH"  0
    "France"        "FRH"  0
    "France"        "FRH"  1
    "France"        "FRI"  1
    "France"        "FRI"  0
    "France"        "FRI"  0
    "France"        "FRI"  0
    "France"        "FRJ"  0
    "France"        "FRJ"  0
    "France"        "FRJ"  0
    "France"        "FRJ"  0
    "France"        "FRK"  0
    "France"        "FRK"  0
    "France"        "FRK"  0
    "France"        "FRK"  1
    "France"        "FRL"  0
    "France"        "FRL"  0
    "France"        "FRL"  0
    "France"        "FRL"  0
    "Italy"         "ITC"  0
    "Italy"         "ITC"  0
    "Italy"         "ITC"  0
    "Italy"         "ITC"  0
    "Italy"         "ITF"  1
    "Italy"         "ITF"  0
    "Italy"         "ITF"  1
    "Italy"         "ITF"  0
    "Italy"         "ITG"  0
    "Italy"         "ITG"  0
    "Italy"         "ITG"  0
    "Italy"         "ITG"  0
    "Italy"         "ITH"  0
    "Italy"         "ITH"  1
    "Italy"         "ITH"  1
    "Italy"         "ITH"  0
    "Italy"         "ITI"  0
    "Italy"         "ITI"  0
    "Italy"         "ITI"  0
    "Italy"         "ITI"  0
    "Netherlands"   "NL1"  0
    "Netherlands"   "NL1"  0
    "Netherlands"   "NL1"  0
    "Netherlands"   "NL1"  0
    "Netherlands"   "NL2"  0
    "Netherlands"   "NL2"  0
    "Netherlands"   "NL2"  0
    "Netherlands"   "NL2"  0
    "Netherlands"   "NL3"  0
    "Netherlands"   "NL3"  0
    "Netherlands"   "NL3"  0
    "Netherlands"   "NL3"  0
    "Netherlands"   "NL4"  0
    "Netherlands"   "NL4"  0
    "Netherlands"   "NL4"  0
    "Netherlands"   "NL4"  0
    "Portugal"      "PT1"  1
    "Portugal"      "PT1"  0
    "Portugal"      "PT1"  0
    "Portugal"      "PT1" -1
    "Great Britain" "UKC"  0
    "Great Britain" "UKC"  0
    "Great Britain" "UKC"  0
    "Great Britain" "UKC"  0
    "Great Britain" "UKD"  0
    "Great Britain" "UKD"  0
    "Great Britain" "UKD"  0
    "Great Britain" "UKD"  1
    "Great Britain" "UKE"  0
    "Great Britain" "UKE"  1
    "Great Britain" "UKE"  0
    "Great Britain" "UKE"  0
    "Great Britain" "UKF"  0
    "Great Britain" "UKF"  0
    "Great Britain" "UKF"  0
    "Great Britain" "UKF"  0
    "Great Britain" "UKG"  0
    "Great Britain" "UKG" -5
    "Great Britain" "UKG"  0
    "Great Britain" "UKG"  0
    "Great Britain" "UKH"  0
    "Great Britain" "UKH"  0
    "Great Britain" "UKH"  1
    "Great Britain" "UKH"  0
    "Great Britain" "UKI"  0
    "Great Britain" "UKI"  0
    "Great Britain" "UKI"  0
    "Great Britain" "UKI"  0
    "Great Britain" "UKJ"  0
    "Great Britain" "UKJ"  0
    "Great Britain" "UKJ"  0
    "Great Britain" "UKJ"  0
    "Great Britain" "UKK"  0
    "Great Britain" "UKK"  0
    "Great Britain" "UKK"  0
    "Great Britain" "UKK"  0
    "Great Britain" "UKL"  1
    "Great Britain" "UKL"  0
    "Great Britain" "UKL"  0
    "Great Britain" "UKL"  0
    "Great Britain" "UKM"  0
    "Great Britain" "UKM"  0
    "Great Britain" "UKM"  0
    "Great Britain" "UKM"  0
    "Great Britain" "UKN"  0
    "Great Britain" "UKN"  0
    "Great Britain" "UKN"  0
    "Great Britain" "UKN"  0
    end
    label values a042 labc
    label def labc -5 "missing: other", modify
    label def labc -1 "don“t know", modify
    label def labc 0 "not mentioned", modify
    label def labc 1 "mentioned", modify
    For each region, identified by reg_nuts1, I would like to generate a new variable, which produces the percentage of observation scoring 1.

    If I use:

    Code:
    estpost tabulate a042
    mat list e(pct)
    mat b = e(pct)
    gen percentage=b[1,4]
    I would obtain a variable, percentage, which calculate the percentage of values 1 over the entire dataset. Instead, as I mentioned before, I need that percentage for each reg_nuts1. Any suggestion?

    Thanks in advance,

    Dario

  • #2
    The mean of a 0/1 variable is a proportion. Multiply that by 100 and you get a percentage. I see a -1 value for France. Is that a typo?

    Code:
    replace a042= abs(a042)
    bys country reg_nuts1: egen wanted= mean(a042*100)

    Comment


    • #3
      Andrew Musau Thanks. No that is not a type. Actually, the variable in the dataset can take up to five values: -5 (ans: missing); -2 (ans: no answer); -1 (ans: I do not know); 0 (ans: No); 1 (ans:Yes). So I do not have simply 0 and 1. I am afraid I posed my question in a too simplicistic form.

      Comment


      • #4
        I was trying something like that:

        Code:
        levelsof reg_nuts1, local(regions)
        foreach r of local regions {
        estpost tabulate a042 if reg_nuts=="`r'"
        mat list e(pct)
        mat b = e(pct)
        gen `r'=b[1,5]
        }
        
        sort reg_nuts1
        
        egen obedience = rowtotal(DE1-UKN)
        Then I realized that the percentage, i.e. pct may not necessary be in the fifth position of the row, since for some regions I have all fine answers, while for others I have 3 or 4 or 2 answer.

        Comment


        • #5
          It is not a big adjustment to the code. The issue is whether you would like to calculate percentages including the non 0/1 values or whether you would like to exclude them.

          Comment


          • #6
            Andrew Musau Yes. I want to include all values. I do not mind if someone answers "i do not know". Because each indivudal contributes to generate the total number of people in a region

            Comment


            • #7
              Including missing and don't know responses:

              Code:
              bys reg_nuts1: egen wanted= mean(cond(a042!=1, 0, a042*100))

              Comment

              Working...
              X